What's the purpose of array size inside subroutine arguments?

Hello everyone,

I’ve seen a senior Fortran programmer denote the array sizes as residualError(numEqns*numCells) shown below.

I tried this, and it doesn’t do any bounds/error checking.

Is it purely for decoration, or self-documentation?

I guess, it helps to understand the code better.

Thanks.


! The maximum allowed size of a is 5
subroutine set_elem(a)
    real, intent(inout) :: a(*)  ! an array of size 5 is expected
    a(6) = 42.0    ! <-- programmer made an error
end subroutine

subroutine set_elem_sized(n,a)
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    a(6) = 42.0 
end subroutine

program test
implicit none (type,external)
external :: set_elem, set_elem_sized
real :: a(5)

call set_elem(a)    ! :bomb:,  out-of-bounds write

call set_elem_sized(5,a)   ! produces warning

end program
$ gfortran -Wall -fcheck=all test_size.f90 
$ ./a.out
At line 11 of file test_size.f90
Fortran runtime error: Index '6' of dimension 1 of array 'a' above upper bound of 5

Error termination. Backtrace:
#0  0x102497aae
#1  0x1024987a5
#2  0x102498d15
#3  0x102040bed
#4  0x102040cc1
#5  0x102040d06

Without the explicit bound argument, we just created an out-of-bounds write, an exploitable vulnerability.

1 Like

What fortran-lang terms explicit-shape arrays (the kind of array arguments you showed) existed in the language before the now preferred assumed-shape array, so the senior Fortran programmer could be using explicit-shape arrays out of habit. A technical consideration is that interoperability with C of Fortran assumed-shape arrays was added after interoperability with explicit-shape arrays and is more complicated. In general it may be more difficult for other languages to deal with Fortran subroutines having assumed-shape arrays, since the array dimensions are not passed explicitly.

2 Likes

If the programmer calls the subroutine with an incorrect length parameter, then that also would be an exploitable vulnerability, and one that the compiler probably will not catch.

If a programmer tries hard enough, he can shoot himself in the figurative foot.

This is why every tutorial I give I make the statement “never use explicit shape or assumed size (i.e. (*)) array arguments. Always use assumed shape (i.e. (:)).”

2 Likes

Never?

I kind of like that the dimensions of the arrays are included in the argument, so it helps to understand the code better.

1 Like

@ivanpribec
Interestingly, Silverfrost FTN95 will give a run-time error report at line 4 when using /check as it does provide additional debugging information for memory size allocated / available for real, intent(inout) :: a(*). This can be a useful debugging capability when using assumed-size arrays.

Personally, I think this rule is too strong, and both assumed-size and assumed-shape arrays have valid use-cases.

When it comes to explicit-size array arguments (including assumed-size)

  • the known length and array contiguity can offer better optimization opportunities
  • very commonly used for interoperability with C APIs

Just to give you a simple example:

subroutine sqrsum(a,b,c)
real, intent(in) :: a(:), b(:)
real, intent(out) :: c(:)
c = a**2 + b**2
end subroutine

subroutine sqrsumn(n,a,b,c)
integer, intent(in) :: n
real, intent(in) :: a(n), b(n)
real, intent(out) :: c(n)
c = a**2 + b**2
end subroutine

When compiled with gfortran -O2 -march=skylake the instructions produced are:

  1	sqrsum_:                                  | sqrsumn_:
  2	        push    rbx                       |         movsx   rdi, DWORD PTR [rdi]
  3	        mov     ebx, 1                    |         test    edi, edi
  4	        mov     r8, QWORD PTR [rdx+40]    |         jle     .L5
  5	        mov     r10, QWORD PTR [rdi+40]   |         sal     rdi, 2
  6	        mov     r9, QWORD PTR [rsi+40]    |         xor     eax, eax
  7	        mov     r11, QWORD PTR [rdi+56]   | .L3:
  8	        test    r8, r8                    |         vmovss  xmm1, DWORD PTR [rdx+rax]
  9	        mov     rax, QWORD PTR [rdx]      |         vmovss  xmm0, DWORD PTR [rsi+rax]
 10	        mov     rcx, QWORD PTR [rdi]      |         vmulss  xmm1, xmm1, xmm1
 11	        cmove   r8, rbx                   |         vfmadd132ss     xmm0, xmm1, xmm0
 12	        test    r10, r10                  |         vmovss  DWORD PTR [rcx+rax], xmm0
 13	        mov     rdx, QWORD PTR [rsi]      |         add     rax, 4
 14	        cmove   r10, rbx                  |         cmp     rdi, rax
 15	        test    r9, r9                    |         jne     .L3
 16	        cmove   r9, rbx                   | .L5:
 17	        sub     r11, QWORD PTR [rdi+48]   |         ret
 18	        js      .L11
 19	        sal     r10, 2
 20	        xor     esi, esi
 21	        sal     r9, 2
 22	        sal     r8, 2
 23	.L6:
 24	        vmovss  xmm1, DWORD PTR [rdx]
 25	        vmovss  xmm0, DWORD PTR [rcx]
 26	        mov     rdi, rsi
 27	        add     rcx, r10
 28	        inc     rsi
 29	        add     rdx, r9
 30	        vmulss  xmm1, xmm1, xmm1
 31	        vfmadd132ss     xmm0, xmm1, xmm0
 32	        vmovss  DWORD PTR [rax], xmm0
 33	        add     rax, r8
 34	        cmp     rdi, r11
 35	        jne     .L6
 36	.L11:
 37	        pop     rbx
 38	        ret

See how the assumed-shape version does extra register shuffling at the beginning, whereas the contiguous versions proceeds almost immediately into the computational loop. If you have lots of assumed-shape arguments this contributes to a higher register pressure, and some arguments may need to be pushed to the stack. In contrast the known-size version uses only four registers,

  • rdi for the address at which the common size n is stored,
  • rsi for the base address of array a
  • rdx for the base address of array b
  • rcx for the base address of array c

In the computational loop .L6, the assumed-shape version is 13 instructions, whereas the contiguous version has 8. The additional instructions track the element position using the registers rdx, rcx, and rax, as assumed-shape implies the arguments could have different strides.

When compiled with with -O3 -march=skylake -fopt-info, the message

/app/example.f90:4:15: optimized: versioned this loop for when certain strides are 1

says the compiler generated two code-paths, one optimized for the case when both arguments are contiguous, and a second one, using a sequential loop when one or both of the strides are larger than one.

NVIDIA also recommends avoiding assumed-shape arrays when it comes to GPU kernels (that is CUDA kernels, or functions called inside of OpenACC regions). For more details see the talk by David Appelhans on Best Practices for Programming GPUs using Fortran, OpenACC, and CUDA | NVIDIA On-Demand. Here is a snapshot taken from that video:


There are however very valid use-cases for assumed-shape arrays too. One of my favourites is for situations where both SoA and AoS data-structures are commonly used. Say you have some type of spatial search tree in 3-d:

subroutine search(x,y,z, ...)
real, intent(in) :: x(:), y(:), z(:)
! ...
end subroutine

By using assumed-shape arrays, users can do the following,

type :: point
   real :: x, y, z
end type

type :: point_collection(n)
   real, len :: n
   real :: x(n), y(n), z(n)
end type

type(point), allocatable :: pa(:)
type(point_collection(n=:)), allocatable :: pc

! ... initialize pa and pc

call search(pa%x,pa%y,pa%z, ...)  ! AoS

call search(pc%x,pc%y,pc%z, ...)  ! SoA

The code-path generated for the AoS layout won’t be optimal, but at least it doesn’t make a temporary copy, and the locality of the data accesses remains good.

Edit: for anyone seeking to combine two files for a side-by-side comparison, I used this script:

#!/usr/bin/env bash
# Combines two lines side-by-side and numbers them
# Usage: combine.sh <file1> <file2>
nl -w3 <(paste $1 <(awk '{print "| "$0}' $2) | (column -s $'\t' -t))
6 Likes

When teaching, I’m opting for simple rules, and fewer foot-guns. Including the dimensions (explicit shape) may seem like it makes the code easier to understand, but I find it leads to a false sense of protection and bugs far more often that it helps.

For experienced programmers, working in cases where they have evidence that it does in fact make a significant difference to performance and is not a public facing API, I have no concern with saying explicit or assumed shape is ok. But of course I don’t really have to teach that group.

Roughly speaking, I tend to follow these usage rules

  • high-level, set-up, array-expression-based, convenience → assumed-shape
  • low-level, kernel, loop-based, performance → explicit- or assumed-size

When it comes to kernel-like code, the known shapes and sizes, as well as Fortran’s aliasing rules make the compiler’s job easier. For example the inner-most kernel in the time-loop of a PDE solver would use explicit-shape arrays, but all the set-up code (i.e. the public API), that gets executed rarely can use assumed-shape arrays.

I found this blog, Can You Trust a Compiler to Optimize Your Code?, puts it nicely:

Compilers are myopic — they have a hard time reasoning about code outside of the current function and values not held in the local variables. Inlining and scalar replacement of aggregates are two optimizations to remedy the situation. Zero cost abstractions work by expressing opportunities for guaranteed optimizations in the language’s type system.

2 Likes

Compilers do bounds checking for it:

program array_test

real :: A(10)

A = 0

call f(size(A), A)

print *, A

contains

    subroutine f(n, A)
    real, intent(inout) :: A(n)
    A(n+1) = 1
    end subroutine

end program

This gives:

$ gfortran -g -fcheck=all a.f90 && ./a.out
At line 15 of file a.f90
Fortran runtime error: Index '11' of dimension 1 of array 'a' above upper bound of 10

Error termination. Backtrace:
#0  0x102be3bd3 in ???
#1  0x102be47a7 in ???
#2  0x102be4be3 in ???
#3  0x1026e3bf7 in f
	at /tmp/a.f90:15
#4  0x1026e3c9f in array_test
	at /tmp/a.f90:7
#5  0x1026e3d7b in main
	at /tmp/a.f90:9

I used to teach that too, but I changed my mind. Now I very often prefer explicit shape. It’s self documenting, and often enforced by the compiler (compilers are not perfect, but I think they could be perfect here, and hopefully they will in the future), and you can restrict the dimensions more than with assumed shape arrays, and typically you want to pass “model parameters” (the array sizes) anyway as arguments. Here is a recent example where I did exactly that: fastGPT/gpt2.f90 at c2148fbd909c82ec72eaccc00d8ddc51e9106144 · certik/fastGPT · GitHub, as you can see, almost every function has explicit shape arguments.

1 Like

@FedericoPerini has been creating a modernized Lapack.

I suggested replacing assumed-size arguments in Lapack with explicit-shape. Do people agree with this suggestion? Even if it’s a good idea, there’s a question of how much work it would be to implement.

2 Likes

Do they really?

program array_test

real :: A(10)

A = 0

call f(size(A)+1, A)

print *, A

contains

    subroutine f(n, A)
    real, intent(inout) :: A(n)
    A(n) = 1
    end subroutine

end program
$ example.f90 -o example.exe && ./example.exe
   0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000

The above is exactly the kind of mistake I see all the time. People expect the compiler to yell about the size mismatch. It doesn’t. The bounds checking is a secondary thing.

2 Likes

The above example also doesn’t bounds check for me in GFortran:

$ gfortran -g -fcheck=all a.f90 && ./a.out

   0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000    

However I consider that a bug in the compiler. Can you try NAG? I know it has an option to even bounds check “(*)” arrays, so I would expect it to bounds check the above case.

Indeed, nagfor catches it.

$ nagfor -C=all example.f90 -o example.exe && ./example.exe 
NAG Fortran Compiler Release 7.1(Hanzomon) Build 7144
[NAG Fortran Compiler normal termination]
Runtime Error: example.f90, line 13: Invalid reference to procedure ARRAY_TEST:F - Dummy array A (number 2) has 11 elements but actual argument only has 10 elements
Program terminated by fatal error

That said, Intel doesn’t

$ ifx -check all,nouninit example.f90 -o example.exe && ./example.exe
  0.0000000E+00  0.0000000E+00  0.0000000E+00  0.0000000E+00  0.0000000E+00
  0.0000000E+00  0.0000000E+00  0.0000000E+00  0.0000000E+00  0.0000000E+00

and neither does Cray

$ ftn -eCD example.f90 -o example.exe && ./example.exe
ftn-1069 ftn: WARNING in command line
  The ipa optimization option is ignored, if the debugging level is 0.
 10*0.

nor nvfortran

$ nvfortran -C example.f90 -o example.exe && ./example.exe 
    0.000000        0.000000        0.000000        0.000000     
    0.000000        0.000000        0.000000        0.000000     
    0.000000        0.000000    

nor flang-new

$ flang-new -g example.f90 -o example.exe && ./example.exe 
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.

nor lfortran

$ lfortran example.f90 -o example.exe && ./example.exe
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 

So given that no free compiler catches this I recommend to avoid it.

3 Likes

You can write a subroutine with explicit-shape array arguments but for safety wrap it with a subroutine that has assumed-shape arguments and checks that the array dimensions are correct. Here is an example.

module matmult_mod
implicit none
integer, parameter :: dp = kind(1.0d0)
contains
subroutine matmult_assumed(a, b, c, ierr)
real(kind=dp), intent(in)  :: a(:,:) ! (n1, n2)
real(kind=dp), intent(in)  :: b(:,:) ! (n2, n3)
real(kind=dp), intent(out) :: c(:,:) ! (n1, n3) -- matrix product of a and b
integer      , intent(out) :: ierr
integer                    :: n1, n2, n3
n1 = size(a,1)
n2 = size(a,2)
n3 = size(b,2)
! check that arrays conform
ierr = findloc([size(b,1) == n2, size(a,1) == size(c,1), size(b,2) == size(c,2)], .false., dim=1)
if (ierr /= 0) return
call matmult_explicit(n1, n2, n3, a, b, c)
end subroutine matmult_assumed
!
subroutine matmult_explicit(n1, n2, n3, a, b, c)
integer      , intent(in)  :: n1, n2, n3
real(kind=dp), intent(in)  :: a(n1,n2)
real(kind=dp), intent(in)  :: b(n2,n3)
real(kind=dp), intent(out) :: c(n1,n3) ! matrix product of a and b
c = matmul(a,b)
end subroutine matmult_explicit
!
end module matmult_mod
!
program main
use matmult_mod
implicit none
integer, parameter :: n1=3, n2=5, n3=4
real(kind=dp) :: a(n1,n2), b(n2,n3), c(n1,n3)
integer :: ierr
call random_number(a)
call random_number(b)
call matmult_assumed(a,b,c,ierr)
if (ierr == 0) then
   print*,"max error =", maxval(abs(c - matmul(a,b)))
else
   print*,"ierr =",ierr
end if
call matmult_assumed(a(2:,:),b,c,ierr) ! wrong sizes
print*,"here, ierr =",ierr
end program main

output:

 max error =   0.0000000000000000     
 here, ierr =           2
1 Like

As a user, what I want is a compiler enforced Debug mode, perhaps we can call it “pedantic”, that catches all segfaults and undefined behavior with a combination of compile time and runtime checking. I managed to get a segfault from the NAG compiler, but I had to try. I think eventually it can become bulletproof.

Consequently, I recommend to fix free compilers to catch it. :slight_smile: We’ll work on all such checks in LFortran after we deliver beta, compiling valid codes.

In my opinion, Fortran has many such deficiencies, that should be fixed with better compilers and tooling. Avoiding it does not fix the problem for me, since the “(:)” arrays do not couple the “model parameters” with dimensions, so while they are checked today, the checks are too late (such as when you actually index the array in some deeply nested matmul like subroutine and it happens to be out of bounds), which makes the development and debugging of codes harder. I want the check to trigger immediately at function call of the top level function, like the NAG compiler delivers today.

2 Likes

:100: I do think it would require a lot more work from the runtime, but I’d love to have a “processor” that was much better at it. It would be incredible if they could be specific too. Like “You never initialised that variable that got passed through like 5 procedure calls” or “you deallocated this variable over there, but now you’re trying to reference it” or the holy grail “warning: you deallocated this target, but these other pointers were still associated with it”

2 Likes

Silverfrost FTN95 runs on Windows (and on Linux, using Wine, etc.), and catches the error:


Attempt to call a routine with argument number two containing too few array elements at address 1a0093e9

Within file bchk.exe
in ARRAY_TEST~F at address 160
in ARRAY_TEST in line 7, at address 79

Lahey-Fujitsu 7.1 does, as well:

jwe0316i-w The array size of dummy argument 2 is greater than the usable size of the actual array (actual argument A: 40 bytes, dummy argument A: 44 bytes).
 error occurs at _array_test@_f_                 line 13 loc 004011bb offset 0000009c
1 Like

Your overall point stands, but never forget g95 :slight_smile: . It gives

Exception: Access Violation
At line 9 of file xarray_test.f90 (Unit 6)