Is there an intrinsic way to transpose N-dim arrays?

milancurcic · June 16, 2022, 7:12pm

I’d like to tranpose arrays to get some data from C-style to Fortran-style ordering. For a rank-2 array I can use the intrinsic transpose. But how best to do it for higher ranks? The intrinsic reshape offers the order optional argument, so I tried, for example for rank-4:

x = reshape(x, shape=reverse(shape(x)), order=[4, 3, 2, 1])

and reverse is a function that reverses the order of elements in a rank-1 array.

However, the value of shape must be a constant expression, and I don’t know the shape of x at compile time, so I can’t use this approach.

Currently, I have:

pure function reverse_dim_order(x) result(res)
  real, intent(in) :: x(:,:,:,:)
  real, allocatable :: res(:,:,:,:)
  integer :: dims(4)
  integer :: i, j, k, l
  dims = shape(x)
  allocate(res(dims(4), dims(3), dims(2), dims(1)))
  do concurrent(i = 1:dims(1), j = 1:dims(2), k = 1:dims(3), l = 1:dims(4))
    res(l,k,j,i) = x(i,j,k,l)
  end do
end function reverse_dim_order

which works, but I’m wondering is there a more intrinsic approach to this that I could take? Performance is not critical, so I’d like to take an approach that’s as concise, intrinsic, and Fortrannic as possible.

Beliavsky · June 16, 2022, 8:08pm

I don’t think that is correct. Metcalf, Reid, and Cohen (2018) say only that

The size of shape must be constant, and its elements must not be negative.

and the following code compiles and runs:

program main
implicit none
real, allocatable :: x(:,:)
allocate (x(2,3))
call random_number(x)
print*,"shape, sum=",shape(x),sum(x)
x = reshape(x,shape=[size(x,2),size(x,1)])
print*,"shape, sum=",shape(x),sum(x)
end program main

So I think you could write separate transpose functions using reshape with an order argument that work for 3D arrays, 4D arrays, etc.

milancurcic · June 17, 2022, 1:48pm

Indeed, you’re right, I wasn’t reading the rules carefully.

I have settled with this solution. Not as elegant as I’d like it but I think it’s as intrinsic as we can get.

RonShepard · June 17, 2022, 11:45pm

Are there any optimizations that can be done for these kinds of transpose operations? For example, if instead of working on entire rows and columns (and their generalization for higher dimension arrays), does it make sense to operate on small subblocks of the arrays? The goal would be to minimize cache misses, or page faults, and such. This could be done by writing explicit loops and then strip-mining the assignment statements.

gnikit · June 18, 2022, 12:13am

Well there are always optimisations to be made. The real question is if one should bother. Unless these multi-D transpose operations take a substantial amount of time or consume an unsustainable amount of memory there really is no point.

I think this common phrase is applicable in this case “If it ain’t broke don’t fix it.”

JohnCampbell · June 18, 2022, 11:54am

Is transpose what is required to get some data from C-style to Fortran-style ordering ?
Does the DO CONCURRENT transpose x or merely reshape x ?
For xt to be the transpose of x, xt(j,i) = x(i,j).
Does reshape change the memory order of the elements of x

To go from “C style” to “Fortran style”, isn’t all that is required is to change the subscript order when referencing an element of the array or when dimensioning in Fortran or C code ?
Just reference the contiguous elements of the array correctly for the language.

RonShepard · June 18, 2022, 5:41pm

I think in both the cases with TRANSPOSE() and RESHAPE(), they are normal functions that return results in the normal way, namely a temporary array is generated and the results are placed in that temporary array. However, if you look at optimized code, or if you use C_LOC() to look at memory addresses of the array elements, you wil see that the compiler often times can avoid the memory allocation and storage step by doing something fancy with the array metadata. You can think of this as a “shallow transpose” or a “shallow reshape” kind of operation. However, at present, the programmer has no direct access to that functionality, it is just up to the compiler to recognize it and to do it when it thinks it is appropriate.

I think it would be a good idea if such operations were exposed and made available to the programmer. For example, an operation like B=>SHALLOW_TRANSPOSE(A) would return a pointer to the memory occupied by the original A, but with the array metadata adjusted so that B(i,j) and A(j,i) reference the same memory location.

I think this would be useful for array transpose, reshape, and for complex arrays the conjugate and conjugarte transpose operations. The key thing is that the operation would take O(1) time, not O(M*N) time, for an MxN array. Most optimizing compilers already do this in special cases, it would be just a matter of exposing this capability directly to the programmer.

septc · June 19, 2022, 1:33pm

I think it would be possible for compilers to create an array pointer having “opposite” index order, in principle. But if those arrays are passed to assumed-shape dummy arrays in an existing routine, I am afraid the programmer needs to take additional care of possibly different index orders for contiguous access (e.g., by calling “is_fortran_order()” etc and determining which of “Fortran” or “C” order for a particular array), and then prepare different do-loops that best match the contiguous order for a given descriptor. I guess this may be cumbersome possibly… (though the idea of transposed pointer seems useful).

RonShepard · June 19, 2022, 5:26pm

As a separate optimization step, the subroutine could well check for array strides and try to optimize based on that. As I said before, the compiler is already doing this in some cases, so maybe that part is already happening and we don’t notice it.

awvwgk · June 19, 2022, 5:58pm

It should be possible to do the transpose in three lines using reshape, automatic left-hand-side reallocation and the Fortran 2018 rank intrinsic.

associate(shape => shape(array))
  array = reshape(array, shape=shape(rank(array):1:-1), order=[(i, i=rank(array), 1, -1)])
end associate

If we could index results of a function evaluation it would be just a single line

array = reshape(array, shape=shape(array)(rank(array):1:-1), order=[(i, i=rank(array), 1, -1)])

septc · June 19, 2022, 8:32pm

associate(s => shape(array), n => rank(array))
  array = reshape(array, shape=s(n:1:-1), order=[(i, i=n,1,-1)])
end associate

It is interesting that if I try to use a short-hand symbol like n above, the compilation fails again for the same error (constant size)…

   18 |   array = reshape(array, shape=s(n:1:-1), order=[(i, i=n,1,-1)])
      |                               1
Error: 'shape' argument of 'reshape' intrinsic at (1) must be an array of constant size

milancurcic · June 19, 2022, 9:40pm

Right, the example by @awvwgk doesn’t work for the same reason that my first snippet in the top post doesn’t work, and as @Beliavsky pointed out in his response. If this had worked, the rank of the result of reshape couldn’t be determined at compile-time, and as I understand it, the constant expression requirement for the shape argument to reshape is there to ensure that.

awvwgk · June 19, 2022, 9:58pm

Both GFortran and Intel Fortran seem to compile it just fine.

❯ cat transpose.f90
implicit none
integer :: i
real, allocatable :: array(:, :, :, :)

allocate(array(5, 6, 2, 4))
call random_number(array)
print *, shape(array)

associate(shape => shape(array))
  array = reshape(array, shape=shape(rank(array):1:-1), order=[(i, i=rank(array), 1, -1)])
end associate
print *, shape(array)
end
❯ gfortran transpose.f90 && ./a.out
           5           6           2           4
           4           2           6           5
❯ ifort transpose.f90 && ./a.out
           5           6           2           4
           4           2           6           5

Just be careful which variables you use, size(shape) instead of rank(array) doesn’t work for example.

Edit: At least NAG declares it as extension

❯ nagfor transpose.f90 && ./a.out 
NAG Fortran Compiler Release 7.1(Hanzomon) Build 7111
Extension(F2018): transpose.f90, line 13: RANK intrinsic procedure
[NAG Fortran Compiler normal termination, 1 warning]
 5 6 2 4
 4 2 6 5

milancurcic · June 19, 2022, 10:11pm

Indeed, that’s what threw me off. I was mistakingly assigning rank(array) to a variable or associate name, which doesn’t work because they’re not constant. But rank(array) can be assigned to a parameter, which makes the shape argument a constant expression. Nice “trick”!

milancurcic · June 19, 2022, 10:46pm

Thinking a bit more about rank() and whether its result is a constant expression. Curiously, ifort 2021.4 doesn’t compile this:

$ cat rank.f90 
integer, allocatable :: array(:,:,:)
integer, parameter :: n = rank(array)
end
$ ifort rank.f90 
rank.f90(2): error #7747: Invalid parameter expression.   [RANK]
integer, parameter :: n = rank(array)
--------------------------^
compilation aborted for rank.f90 (code 1)

but GFortran does. Who is correct?

From my interpretation of 10.1.12 in 18-007r1, array in this case is not a constant expression, so neither is rank(array). If my interpretation is correct, then the example by @awvwgk isn’t conforming either? But I don’t see why it shouldn’t be, as rank(array) can be determined at compile-time.

FortranFan · June 20, 2022, 12:26am

For whatever it’s worth, I think the code conforms and IFORT has a bug here.

The expression that has IFORT stumble is a specification inquiry using a standard intrinsic inquiry function where the argument is a variable whose properties are not assumed, deferred, or defined by an expression that is not a constant expression. The standard permits this in a constant expression.

Fyi I’ve turned in a support request with Intel OSC on this.

FedericoPerini · June 20, 2022, 6:33pm

Performancewise, I think the combination of all those intrinsics may reduce chances for compiler optimization. Array halding is exactly about these pointer calculations, and I bet all fortran compilers are excellent at that. So I’d rather leave this thing more explicitly on the loop side, to leave the compiler unleash its optimization strategies. See this version:

program test_reshape4
    implicit none

    integer, allocatable, dimension(:,:,:,:) :: f,c
    integer :: i,j,k,l

    f = reshape([(j,j=1,2*3*4*5)],[2,3,4,5])

    allocate(c(5,4,3,2))

    forall(i=1:2,j=1:3,k=1:4,l=1:5) c(l,k,j,i) = f(i,j,k,l)

    print *, 'Fortran=',f
    print *, 'C      =',c

end program test_reshape4

and look at the amazing way gfortran optimizes this stuff by creating 4-sized chunks of pointers

JohnCampbell · June 22, 2022, 2:39am

This is a good example of using forall to “transpose” the array. This leaves any optimisation order up to the compiler, which we are told we should do.
I think this answers the OP question “Is there an intrinsic way to transpose N-dim arrays?”

However, my point is why bother.

If you are going to use array f or c, you have to use a subscript order that is compatible with the array declaration AND the order that data is stored in memory, independent of it having been created in a Fortran or C routine.
With cache management being so important for performance, we must also consider how to store the data in memory to minimise memory page usage. This applies equally to Fortran or C. This is the key answer to Why bother.

milancurcic · June 22, 2022, 2:25pm

Because one component of the system may emit the data in one order, and another component downstream expects the data in the other order.

Don’t get too hung up on C vs. Fortran order, I merely used that to illustrate what I’m trying to do, as people here are likely to understand the C/Fortran (row/column major) ordering concept. Different software frameworks or libraries emit or consume data assuming their own conventions. A software layer that connects them must follow these conventions for the system to work.

milancurcic · June 22, 2022, 2:28pm

Thanks, I like this as well, but it’s essentially the same as the do concurrent approach in the top post, plus an obsolescent feature. As performance is not critical for me, @awvwgk’s approach still remains my favorite because it doesn’t require hardcoding the shape; except for I’m not yet 100% sure it’s standard-conforming.

Topic		Replies	Views
Array permutation or n-d transpose	3	487	December 13, 2023
Rank a compile time concept vs a run time concept? Help	13	515	May 28, 2024
Do functions check size/rank of input arrays? If not why? Help	14	706	October 27, 2022
Matrix index-pointer confusion Help	21	357	August 11, 2024
Shallow_transfer and ISO_Fortran_binding	2	488	September 5, 2023

Is there an intrinsic way to transpose N-dim arrays?

Related topics