Impossibility to infer array slices from `CFI_cdesc_t`

Prompted by this suggestion and previous discussion I’ve explored whether it’s possibe to extract array slicing information out of CFI_cdesc_t using ISO_Fortran_binding.h.

The aim is to extract array slice information such that when calling a BLAS/LAPACK function with array slices,

  call subroutine solve(a(1:10:4,3:15:2))

it may be used avoiding copyin/copyout internally providing a pointer reference and proper increments:

 subroutine solve(a)
   real, intent(inout) :: a(:,:)
   
   s = strides(a) ! returns [4,2]
   call lapack_fun(a(1,1),incx=s(1),...,incx=s(2))

Though we can return all information in CFI_descr_t back to Fortran nicely, I think the problem is impossible to solve, because different slicings can return the same address information. For example:

        cfi1 = array_descriptor(a(1:5:3,1:10:6))
        call CFI_print(cfi1)
        cfi2 = array_descriptor(b(1:6:3,1:10:5))
        call CFI_print(cfi2)

returns two CFI objects with the same strides:

# a(1:5:3,1:10:6)
base address  : 16D5030D8
total length  : 4
CFI version   : 1
Variable type : 1027
Rank          : 2
   1) [1:2], padded every 3 elements
   2) [1:2], padded every 30 elements
# a(1:6:3,1:10:5)
base address  : 16D502F50
total length  : 4
CFI version   : 1
Variable type : 1027
Rank          : 2
   1) [1:2], padded every 3 elements
   2) [1:2], padded every 30 elements

both have 30-element strides (5 x 6 slices vs. 6 x 5 slices), so the stride of dimension 2 is undefined in terms of number of elements. @PierU any thoughts?

What are the original shapes of a(:,:) and b(:,:) ?

they’re real :: a(5,10),b(6,10)

As far as I know, LAPACK does handle vectors and matrices differently:

  • vectors usually have associated inc* arguments
  • matrices usually have associated ld* arguments, but not inc* ones. That is, the elements in a column are assumed to be stored contiguously, but some padding is allowed between columns. I don’t remember cases with matrices having an inc* associated argument

It means that there’s no way to avoid copy-in/copy-out if one inputs a matrix that has a stride /= 1 along the first dimension, such as a(1:5:3,1:10:6) or b(1:6:3,1:10:5). This is illustrated in my symbolic code, the inc /= 1 case takes the copy-in/copy-out route…

For the matrices, what is important is to detect the offset between columns (30 elements in your case), which can be used the ld* arguments.

Yes, I agree it makes sense to have this for arrays. For matrices, I believe it’s not going to be appealing to pass lda to a high-level API: in that case, it’s just better if the user calls the standard LAPACK API IMHO.

For 1D arrays, this works:

So, while I agree this looks extremely appealing, isn’t it easier and maybe even acceptable to ask the user of the API to give such information with optional array(s)? Say:

call solve( a , [1,10,4], [3,15,2] )

How about doing it like this:

 subroutine solve(a,x)
   real, intent(inout) :: a(:,:), x(:)

   if (is_contiguous(a) .and. is_contiguous(x)) then
     ! ... regular lapack path
   else
       sx = (loc(x(2)) - loc(x(1)))/sizeof(x(1))    ! non-standard, but common, extensions
       ! ...
   end if
end subroutine

Personally, I wonder if copy-in/copy-out isn’t the simpler (and faster) option. I guess it depends in practice on the strides.

PS: is_contiguous doesn’t support checking along a specific dimensions. If we look at the interface of DGESV, an assumed-shape array can only have padding along the first dimension but not along the second one. (Edit: I can see how skipping a column, could be added to the ld* argument, but who would store a linear system this way?)

We don’t have to, the ld* can be easily detected from the array descriptor (which you did: it’s 30 in your examples).

Thanks all @ivanpribec and @hkvzjal and @PierU:

It’s possible - and I managed to get it working - to get a leading_dim(mat) as @PierU and @ivanpribec are suggesting: i.e. for a 10x10 matrix,

But it comes with these challenges, and I’d be glad to have your opinion on these:

  1. the C-interoperable functions are not pure, i.e. we must pass the matrix as intent(inout), target so if we go down this path, we won’t be able to make any of the interface pure either. If it was me, I’d probably try to direct more effort toward making LAPACK and BLAS both pure (some more work is needed to remove data statements and external I/O)

  2. Ivan’s solution is pure, but I’m not sure if it’s standard conforming, so some compilers may complain. Same goes with the ISO_Fortran_binding.h, which older compilers are not shipped with

  3. The case for fast matrix slice solutions comes if one has to repeatedly solve sliced linear systems, which looks like a poor design decision anyways…?

Not that I dislike the information that comes from CFI_cdesc_t, but now that I’ve tried it, and seen the implications, I think this is too much over-engineering of the interface for the current stage of the library. What do you think?

1 Like

I don’t get what you mean… All functions can be pure:

use iso_c_binding
implicit none
real :: a(10,10)

interface
    pure function leading_dimension(a) result(ld) bind(C)
    real, intent(in) :: a(:,:)
    integer :: ld
    end function
end interface

print*, leading_dimension(a)
print*, leading_dimension(a(1:5,:))
print*, leading_dimension(a(1:5,1:10:2))

contains

    pure function ld(a)
    real, intent(in) :: a(:,:)
    integer :: ld
    
    ld = leading_dimension(a)
    end function

end program
#include "ISO_Fortran_binding.h"
int leading_dimension(const CFI_cdesc_t* a) {
    return a->dim[1].sm / a->elem_len;
}

Output as expected:

% icc -c ldc.c && ifort ldf.f90 ldc.o && ./a.out
          10
          10
          20

The main issues are

  • as you said, it assumes the presence of a C companion processor
  • more than that, it works only with the interoperable types

It depends… I had recently a case with a huge matrix where I had to manipulate sub-matrixes (that said, copy-in/copy-out would have been acceptable). Certainly, the leading-dimension approach in LAPACK comes from times where dynamic allocation was not available and one had fixed-size and often overdimensioned arrays. Still, there are some cases where leading dimensions different from the needed size can be desirable (e.g. for alignment purposes).

1 Like

You’re right: in my attempt to ensure that the input variable is passed by reference, I was assuming it to be intent(inout):

    !> C Interface
     interface
        type(array_descriptor) function CFI_to_Fortran(descr) bind(C,name="CFI_to_Fortran")
            import array_descriptor
            type(*), dimension(..), intent(inout) :: descr
         end function CFI_to_fortran
     end interface

intent(in) also works, and so the function can be pure.

Yep, for example on macOS, clang has no Fortran companion processor currently, hence in order to compile we need to use gcc-13. No big deal as the C ABI is universal.

Collecting your and @ivanpribec’s example, this may be standard conforming:

pure integer function leading_dim(a)
   real(real64), intent(in), target :: a(:,:)
   integer(c_intptr_t) :: a11,a21
   if (size(a,2)<=1) then 
      leading_dim = 0
   else
     a11 = transfer(c_loc(a(1,1)),a11)
     a21 = transfer(c_loc(a(1,2)),a21)
     leading_dim = 8*int(a21-a11)/storage_size(a)
   endif
end function leading_dim  

Of course it being fortran, it’s not generic.

1 Like

To be bullet-proof (?), you may replace the 8 by bit_size(0_c_char) :slight_smile: