Question about elemental procedures

RonShepard · June 5, 2026, 11:14pm

I have a question about elemental procedures. I’m not sure if this should be categorized as Help or as Language Enhancement. It could be either, or both.

Look at this code:

program elem
   implicit none

   integer :: a = 0

   integer :: b0, b1(-1:1), b2(2,2), b3(2,2,1)

   call wanted ( a, b0 )
   !call wanted ( a, b1 )
   !call wanted ( a, b2 )
   !call wanted ( a, b3 )

   call sub( a, b0 )
   print *, 'a=', a, 'b0=', b0

   call sub( a, b1 )
   print *, 'a=', a, 'b1=', b1

   call sub( a, b2 )
   print *, 'a=', a, 'b2=', b2
   
   call sub( a, b3 )
   print *, 'a=', a, 'b3=', b3

   call assumed_size(b3)

contains

   elemental subroutine wanted( a, b )
      integer, intent(inout) :: a
      integer, intent(out)   :: b
      a = a + 1
      b = a
      return
   end subroutine wanted
   
   subroutine assumed_size(b)
      integer :: b(*)
      call sub( a, b )
      return
   end subroutine assumed_size
   
   subroutine sub( a, b )
      integer, intent(inout) :: a
      integer, intent(out)   :: b(..)

      integer :: i, j

      select rank (b)
      rank (0)
         print *, 'scalar b'
         a = a + 1
         b = a
      rank (1)
         print *, 'rank 1 b'   
         do i = 1, size(b)
            a = a + 1
            b(i) = a
         enddo
      rank (2)
         print *, 'rank 2 b'
         do j = 1, size(b,dim=2)
            do i = 1, size(b,dim=1)
               a = a + 1
               b(i,j) = a
            enddo
         enddo
      rank (*)
         print *, 'assumed size is unsupported'
      rank default
         print *,  'unsupported rank'
      end select

      return
   end subroutine sub
end program elem

What I wanted was to write a single elemental subroutine wanted(), and then call it with the first argument always being a scalar and the second argument being a scalar or an array of arbitrary rank. The first call to wanted() in the above code is alright. However, the following three calls, which are commented out, are not allowed. If you uncomment one of them and try to compile, you will get an error like:

$ gfortran elem.f90 && a.out
elem.f90:9:17:

    9 |    call wanted ( a, b1 )
      |                 1
Error: Actual argument at (1) for INTENT(INOUT) dummy 'a' of ELEMENTAL subroutine 'wanted' is a scalar, but another actual argument is an array

This is not a gfortran bug, it is a program error, and gfortran correctly identifies the problem. Namely, only intent(in) scalar arguments are allowed to conform to arrays. In my actual application, the first argument gets changed in a complicated way; in this little example code it is just a counter that gets incremented.

In this case, what I want to happen is for the compiler to loop over the array elements, in array order, and call the elemental subroutine wanted() repeatedly (or the “as if” equivalent of that operation). So my first question is why was that limitation imposed by the language. Why not allow intent(inout) scalar arguments to conform to arrays?

Ok, so that then leads to the question of how to work around that limitation. One approach to that would be subroutine sub() in the above listing. This uses the assumed rank feature to allow the second argument to be any rank. The programmer must then account for each rank manually. I’ve done that for rank 0, 1, and 2 arguments just to show that it works. It is also possible to catch the assumed size case and treat it specially; in the above code I just write a message saying it is unsupported.

As you can see within that subroutine, what I’m doing is just looping over the array elements, in array order, and doing repeatedly what the scalar code does. So this works, but if I really wanted to support all possible ranks, up to the 15 allowed by the fortran standard, or up to 31 for the intel compiler, this would be very tedious, and it would be easy to make a typo which might result in a rare error that would be difficult to detect.

One could also write separate subroutines for each rank, and then make a generic interface for all of them. That is even more tedious and error prone, so no real advantage to that approach. It does have the advantage that unsupported ranks are detected at compile time rather than run time, so maybe that is something important.

So after my first question of why doesn’t the compiler do that for me (it would never make any of those types of mistakes, right?), my second question is whether there is a better way to handle this type of programming situation. Is there some simple and robust way to loop over all the elements of an array of arbitrary rank? Or is this clunky manual way with select rank the only option?

Is it worth trying to change the fortran standard in this situation? How often does this limitation arise for programmers? Is there some kind of ambiguity associated with the generalization of the current elemental feature?

urbanjost · June 6, 2026, 1:43am

Allowing this was a de-facto standard behavior pre-f90 but never part of the standard so you generally can do this with a compiler option to allow legacy behavior, but thisis discouraged with new code.

A little more standard way is to use sequence association. It is generally acceptable to do this with legacy code but is probably frowned upon when writing new code as well.

This issue is often probably the second hardest common issue to eliminate
when modernizing old code after storage being used to hold different types
either by passing arrays with a type mismatch or via equivalancing and
common blocks.

You can also create a flattened copy of the arrays and pass the temporary
and then store it back into the original, which can be lot of overhead but
uses standard methods.

This is such a common need and is assumed to work in so much legacy code
I wish it had made it into the standard, but I know of no clean
intuitive way to do what you want which is not fighting the standard,
although if you have enough of it to make it worth it there are some
tricks using the ISO_C_STANDARD interface to let you do this more efficiently
without having to enable legacy behavior.

For most cases you do not have to do that many dimensions, so I would
generally recommend the standard methods, as most (certainly not all
though) codes use four dimensions or less, or do just a few multi-dimensional
sizes so in practice they are reasonable methods.

That being said …

You can use sequence association. Place the "wanted" procedure in a 
seperate file without an interface, like you often would with F77 code.

Declare the dummy argument dimension with an asterisk (*).

Pass the number of elements as a new argument.

Pass the first element of the array or array section explicitly (known as sequence association). 

You might have to add a compiler-specific option depending on how much the
compiler wants to prevent you from using sequence association.

   gfortran -fallow-argument-mismatch src/* app/main.f90

So wanted.f90 would look like

   subroutine wanted( a, b, n )
      integer, intent(inout) :: a
      integer, intent(out)   :: b(*)
      a = a + 1
      b(:n) = a
      return
   end subroutine wanted

And you would use size(array) as the actual argument for n for arrays, and 1 for scalars.

As enhancement request, good luck. It would get wide support from the user community but it has come up many times
particularly because virtually all pre-f90 compilers supported sequence association and it never gets much farther
than the argument it was never explicitly allowed in the standard.

urbanjost · June 6, 2026, 2:05am

Newest version of gfortran appears to detect the mismatch even if compiled separately without a switch to allow mismatches, earlier versions used to not complain as long as the subroutine was compiled separately.

program arbitrary
implicit none
integer :: a
integer :: b0, b1(-1:1), b2(2,2), b3(2,2,1)

   write(*,*)'WANTED:'
   a=0
   call wanted ( a, b0 ,1)
   print *, 'a=', a, 'b0=', b0
   call wanted ( a, b1 ,size(b1))
   print *, 'a=', a, 'b1=', b1
   call wanted ( a, b2 ,size(b2))
   print *, 'a=', a, 'b2=', b2
   call wanted ( a, b3 ,size(b3))
   print *, 'a=', a, 'b3=', b3

end program arbitrary

subroutine wanted( a, b, n )
integer, intent(inout) :: a
integer, intent(out)   :: b(*)
   a = a + 1
   b(:n) = a
end subroutine wanted

$ gfortran arbitrary.f90 -fallow-argument-mismatch -o arbitrary
arbitrary.f90:8:20:

    8 |    call wanted ( a, b0 ,1)
      |                    1
Warning: Rank mismatch in argument ‘b’ at (1) (rank-1 and scalar)

./arbitrary
 WANTED:
 a=           1 b0=           1
 a=           2 b1=           2           2           2
 a=           3 b2=           3           3           3           3
 a=           4 b3=           4           4           4           4

urbanjost · June 6, 2026, 2:54am

If asking for an enhancement, something like how imaginary number components can be passed as INOUT variables instead of values migh t be good, as in “var%pack” that would act something like cmpx%im and cmpx%re. No matter what VAR was (even scalar) it would be passed as a single-dimension array to the called procedure. Probably would require var to be contiguous I guess.

Breaking News(sort of)

Feeling patient? Well, using GNU Fortran (GCC) 16.0.0 20250727
(experimental) and the -std=f202y switch this actually ran, although
not sure it will when the next standard comes out or not.

So caveat emptor, but I think your wish might be coming. I can think
of a lot of uses for that.

gfortran -std=f202y  point.f90

program point
implicit none
integer :: a
integer :: b0, b1(-1:1), b2(2,2), b3(2,2,1)

   a=0
   call wanted ( a, b0 )
   print *, 'a=', a, 'b0=', b0
   call wanted ( a, b1 )
   print *, 'a=', a, 'b1=', b1
   call wanted ( a, b2 )
   print *, 'a=', a, 'b2=', b2
   call wanted ( a, b3 )
   print *, 'a=', a, 'b3=', b3

contains

subroutine wanted( a, b)
! This technique is known as pointer rank remapping (introduced in Fortran 2003 and expanded in Fortran 2008).
! requires the multi-dimensional target array is simply contiguous.
integer, intent(inout)                  :: a
integer,target, contiguous, intent(out) :: b(..)
integer                                 :: n
integer,pointer                         :: p_b(:)
   n=size(b)
   a = a + 1
   p_b(1:n)=>b  ! will this be allowed outside of SELECT RANK ?
   ! Explicit Bounds: You must specify the explicit upper and lower bounds
   ! on the left-hand side of the pointer assignment.
   p_b = a
end subroutine wanted

end program point

 a=           1 b0=           1
 a=           2 b1=           2           2           2
 a=           3 b2=           3           3           3           3
 a=           4 b3=           4           4           4           4

Maybe a standard solution?

Not sure how many rules I am breaking, but worked with GNU Fortran (GCC)
16.0.0 20250727 (experimental) and is as close as I could get using
standard (I think) code. If it holds up I have several places I can use
flatten(3f) myself. Your question inspired me to beat this one into the
ground as far as I could and it is looking promising from my perspective.

program elem
implicit none
integer :: a
integer :: b0, b1(-1:1), b2(2,2), b3(2,2,1)

   a=0
   call wanted ( a, b0 )
   print *, 'a=', a, 'b0=', b0
   call wanted ( a, b1 )
   print *, 'a=', a, 'b1=', b1
   call wanted ( a, b2 )
   print *, 'a=', a, 'b2=', b2
   call wanted ( a, b3 )
   print *, 'a=', a, 'b3=', b3

contains

function flatten(arr) result(p_arr)
! generically point to a flattened array given a scalar or contiguous array of rank 1 to 15
use, intrinsic :: iso_c_binding
integer,target, contiguous :: arr(..)
integer,pointer            :: p_arr(:)
integer                    :: n
   n=size(arr)
   ! Note explicit bounds are required: You must specify the explicit upper and lower bounds
   ! on the left-hand side of the pointer assignment.
   select rank (arr)                                      
   rank (0);     ! p_arr(1:1)=>arr  ! not allowed up to f2023 at least
      ! Map the scalar's memory address to a 1D array pointer of size 1
      call c_f_pointer(c_loc(arr), p_arr, [1])
      ! unsafe in general, but I think is OK since using a SELECT RANK to ensure a scalar
   rank (1);     p_arr(1:n)=>arr                            
   rank (2);     p_arr(1:n)=>arr                            
   rank (3);     p_arr(1:n)=>arr                            
   rank (4);     p_arr(1:n)=>arr                            
   rank (5);     p_arr(1:n)=>arr                            
   rank (6);     p_arr(1:n)=>arr                            
   rank (7);     p_arr(1:n)=>arr                            
   rank (8);     p_arr(1:n)=>arr                            
   rank (9);     p_arr(1:n)=>arr                            
   rank (10);    p_arr(1:n)=>arr                            
   rank (11);    p_arr(1:n)=>arr                            
   rank (12);    p_arr(1:n)=>arr                            
   rank (13);    p_arr(1:n)=>arr                            
   rank (14);    p_arr(1:n)=>arr                            
   rank (15);    p_arr(1:n)=>arr                            
   rank (*)                                                 
   print *, 'assumed size is unsupported'  
   rank default                                             
   print *, 'unsupported rank'                    
   end select                                              
end function flatten

subroutine wanted( a, b)
! This technique is known as pointer rank remapping (introduced in Fortran 2003 and expanded in Fortran 2008).
! requires the multi-dimensional target array is simply contiguous.
integer, intent(inout)                  :: a
integer,target, contiguous, intent(out) :: b(..)
integer                                 :: n
integer,pointer                         :: p_b(:)
   n=size(b)
   a = a + 1
   p_b=>flatten(b)
   p_b = a
end subroutine wanted

end program elem

Project is up to date
 a=           1 b0=           1
 a=           2 b1=           2           2           2
 a=           3 b2=           3           3           3           3
 a=           4 b3=           4           4           4           4

Hiding the pointer

Alternatively, to avoid using pointers directly
write the called routine to expect a flattened
array and call the argument with flatten().

Assuming the flatten() procedure above has been
placed in a module, the following works so far:

program elem
use M_flatten, only : flatten
implicit none
integer :: a
integer :: b0, b1(-1:1), b2(2,2), b3(2,2,1)

   a=0
   call wanted ( a, flatten(b0) )
   print *, 'a=', a, 'b0=', b0
   call wanted ( a, flatten(b1) )
   print *, 'a=', a, 'b1=', b1
   call wanted ( a, flatten(b2) )
   print *, 'a=', a, 'b2=', b2
   call wanted ( a, flatten(b3) )
   print *, 'a=', a, 'b3=', b3

contains

subroutine wanted( a, b)
integer, intent(inout) :: a
integer, intent(out)   :: b(:)
   a = a + 1
   b = a
end subroutine wanted

end program elem

That looks to be the easiest when modernizing old code that assumed a
scalar could be passed to an array or an array of arbitrary rank could be passed to an array of a different rank.

Elemental INTENT

I think only allowing for input values to be INTENT(IN) is because the goal is to allow for parallel execution. So you want all the calls to be able to occur in any order.

RonShepard · June 6, 2026, 4:16pm

Yes, I thought about that, but I did not mention it in my original post. The idea in practice is that if the actual argument is not contiguous, then copy-in/copy-out argument association will occur. As you say, that is a lot of overhead, and I do expect my actual application to have noncontiguous arguments a good fraction of the time, so I was looking for other possible solutions to this problem.

Note that I want to increment the counter before each scalar assignment, not just before each array assignment, so the code in this case would look something like:

subroutine sub_array( a, b, n)
   integer, intent(in)    :: n
   integer, intent(inout) :: a
   integer, intent(out)   :: b(n)
   integer :: i
   do i = 1, n
      a = a + 1
      b(i) = a
   enddo
   return
end subroutine sub_array

The scalar version would need to be separate to be standard conforming, but as you point out, this would eliminate the need for 15 (or 31) separate cases for each array rank, which is really the main issue.

I think this is probably correct. However, the compiler can see that the argument is intent(inout), so it would know when parallel execution is allowed and when it isn’t. Also, the elemental attribute already requires evaluation in array-element order for intent(in), which is exactly what I would want for this intent(inout) case too. Further, with the impure attribute, which is required in my actual application code, parallel execution would not be allowed anyway; I did not have that attribute in my posted code, and I probably should have done so because of this implementation detail.

Regarding copy-in/copy-out argument association in general, I did not test that for the assumed rank code in my original post. After adding that test, it does appear that c_loc() before the call is the same as c_loc() within sub(), so no copy-in/copy-out is triggered for the gfortran and flang compilers just because of the assumed rank dummy argument. Here is that code:

program elem
   implicit none

   integer :: a = 0

   integer :: b0=-1, b1(-1:1)=-1, b2(2,2)=-1, b3(2,2,2)=-1

   call wanted ( a, b0 )
   !call wanted ( a, b1 )
   !call wanted ( a, b2 )
   !call wanted ( a, b3 )

   call sub( a, b0 )
   print *, 'a=', a, 'b0=', b0

   call sub( a, b1 )
   print *, 'a=', a, 'b1=', b1

   call sub( a, b2 )
   print *, 'a=', a, 'b2=', b2
   
   call sub( a, b3 )
   print *, 'a=', a, 'b3=', b3

   print *, 'rank 3 b3:'
   call printloc( b3(2,1,1) )  ! first element of the argument.
   call sub( a, b3(2,:,:) )  ! noncontiguous slice.
   print *, 'a=', a, 'b3=', b3

   call assumed_size(b3)

contains

   subroutine printloc( a )
      use, intrinsic :: iso_fortran_env, only: int64
      use, intrinsic :: iso_c_binding, only: c_loc
      implicit none
      integer, intent(in), target :: a
      print *, 'c_loc=', transfer( c_loc(a), 1_int64 )
      return
   end subroutine printloc

   impure elemental subroutine wanted( a, b )
      integer, intent(inout) :: a
      integer, intent(out)   :: b
      a = a + 1
      b = a
      return
   end subroutine wanted
   
   subroutine assumed_size(b)
      integer :: b(*)
      call sub( a, b )
      return
   end subroutine assumed_size
   
   subroutine sub( a, b )
      integer, intent(inout) :: a
      integer, intent(out)   :: b(..)

      integer :: i, j

      select rank (b)
      rank (0)
         print *, 'scalar b'
         a = a + 1
         b = a
      rank (1)
         print *, 'rank 1 b'   
         do i = 1, size(b)
            a = a + 1
            b(i) = a
         enddo
      rank (2)
      print *, 'rank 2 b'
      call printloc( b(1,1) )  ! first element.
         do j = 1, size(b,dim=2)
            do i = 1, size(b,dim=1)
               a = a + 1
               b(i,j) = a
            enddo
         enddo
      rank (*)
         print *, 'assumed size is unsupported'
      rank default
         print *,  'unsupported rank'
      end select

      return
   end subroutine sub
end program elem

It produces the output:

$ flang elem.f90 && a.out
 scalar b
 a= 2 b0= 2
 rank 1 b
 a= 5 b1= 3 4 5
 rank 2 b
 c_loc= 4309483536
 a= 9 b2= 6 7 8 9
 unsupported rank
 a= 9 b3= -1 -1 -1 -1 -1 -1 -1 -1
 rank 3 b3:
 c_loc= 4309483556
 rank 2 b
 c_loc= 4309483556
 a= 13 b3= -1 10 -1 11 -1 12 -1 13
 assumed size is unsupported

urbanjost · June 6, 2026, 5:15pm

So with the latest standard moving the FLATTEN() procedure into
WANTED() the least cumbersome form seems to be the following.
Surprised non-contiguous is not causing a problem

program elem
   implicit none
   integer :: a = 0
   integer :: b0=-1, b1(-1:1)=-1, b2(2,2)=-1, b3(2,2,2)=-1

   print *, 'scalar:'
   call wanted( a, b0 )
   print *, 'a=', a, 'b0=', b0

   print *, 'multi-dimensional:'
   call wanted( a, b1 )
   print *, 'a=', a, 'b1=', b1

   call wanted( a, b2 )
   print *, 'a=', a, 'b2=', b2

   call wanted( a, b3 )
   print *, 'a=', a, 'b3=', b3

   print *, 'non-contigouous:'
   call wanted( a, b3(2,:,:) )  ! noncontiguous slice
   print *, 'a=', a, 'b3=', b3

contains

subroutine wanted( a, b )
   use, intrinsic :: iso_c_binding
   use, intrinsic :: iso_fortran_env, only: int64
   integer, intent(inout)       :: a
   integer, target, contiguous, intent(out) :: b(..)
   integer, pointer             :: p_b(:)
   integer                      :: i, n
   n=size(b)
   select rank (b)
   rank (0);  call c_f_pointer(c_loc(b), p_b, [1]) ! Map the scalar's memory address to a 1D array pointer of size 1
   rank (1);  p_b(1:n)=>b
   rank (2);  p_b(1:n)=>b
   rank (3);  p_b(1:n)=>b
   rank (4);  p_b(1:n)=>b
   rank (5);  p_b(1:n)=>b
   rank (6);  p_b(1:n)=>b
   rank (7);  p_b(1:n)=>b
   rank (8);  p_b(1:n)=>b
   rank (9);  p_b(1:n)=>b
   rank (10); p_b(1:n)=>b
   rank (11); p_b(1:n)=>b
   rank (12); p_b(1:n)=>b
   rank (13); p_b(1:n)=>b
   rank (14); p_b(1:n)=>b
   rank (15); p_b(1:n)=>b
   rank (*)
   print *, 'assumed size is unsupported'
   rank default
   print *,  'unsupported rank'
   end select
   do i = 1, size(p_b)
      a = a + 1
      p_b(i) = a
   enddo
end subroutine wanted

end program elem

Output

 scalar:
 a=           1 b0=           1
 multi-dimensional:
 a=           4 b1=           2           3           4
 a=           8 b2=           5           6           7           8
 a=          16 b3=           9          10          11          12          13          14          15          16
 non-contigouous:
 a=          20 b3=           9          17          11          18          13          19          15          20

RonShepard · June 6, 2026, 8:03pm

The b3(2,:,:) argument just happens to have all equally spaced elements, namely the odd members in the array sequence. If you change that to b3(:,2,:), then it should modify elements 3,4,7,8 in the sequence. But running the code with that change gives

scalar:
 a=           1 b0=           1
 multi-dimensional:
 a=           4 b1=           2           3           4
 a=           8 b2=           5           6           7           8
 a=          16 b3=           9          10          11          12          13          14          15          16
 non-contigouous:
 a=          20 b3=           9          10          17          18          19          20          15          16

Array sequence elements 3,4,5,6 are modified instead. The compiler did not complain, so I guess it is the programmer’s responsibility to make sure the pointer assignment is valid.

Adding the contiguous attribute to the dummy argument corrects the array storage layout problem, giving correct results, but of course a copy-in/copy-out argument association is being done by the compiler behind the scenes.

jwmwalrus · June 6, 2026, 9:24pm

I’m not sure if my analysis is flawed, but…

The compiler only has to know the interface (without having to analyze that you want to do a reduce-like operation in an element-wise fashion).

So when ELEMENTAL is involved, and there’s a scalar/array mix, it’s safe to assume a spread step is involved for scalars. The problem is that intent(inout) would imply an “unspread/back-to-scalar” step, but there’s no guarantee that all the elements from the temporary array to be “unspread” are the same… so which element should the “back-to-scalar” step pick?

And even if there’s no straight parallel behavior, because of the IMPURE prefix, some palallel or out-of-order behavior may take place (i.e., the standard doesn’t care about compiler directives, but the compiler does).

RonShepard · June 6, 2026, 9:48pm

There are several possible ways that an elemental procedure could be implemented. This is one way. Another would be for the compiler to loop over the elements of the array (or arrays) and call the scalar code in array element order. There are probably other ways too, such as putting the loop inside the subroutine rather than outside. The only thing specified by the standard is that the execution is performed in array element order, the other possibilities are left to the compiler. I guess the intent(in) restriction is intended to allow that first kind of implementation. It seems like the other kinds of implementations would also allow intent(inout) arrays and scalars to be freely mixed. It seems like the pure elemental combination could be done in parallel, whereas the impure elemental combination would need to be done in serial in order to satisfy the array element sequence requirement.

urbanjost · June 6, 2026, 11:18pm

I have a need for something similar although not quite as general so I put together a github repository that works with fpm called M_flatten that also includes several of the examples discussed here for anyone who wants to tinker (It is of course a WIP).

septc · June 7, 2026, 1:16am

I have little experience with elemental (so not very sure about its restrictions), but isn’t it possibly because a scalar derived-type argument could be passed? (Then, the operation of the elemental procedure may modify the contents of the derived-type variable, so invalidating the assumption of independent operations. Possibly important for parallel execution…?) Also, is it an option to make a an array, apply an elemental procedure, and reduce it (eg, to get the sum of a)? (though this requires another array and might have some potential slowdown)

EDIT: I am sorry, parallel things seem to have already been mentioned above… (and even for scalar primitive types like real, “data race” can occur for parallel execution, so derived types may be not essential here…)

Topic		Replies	Views
Array element actual argument & assumed size dummy Help	17	490	May 1, 2025
Scalar/array mismatch in subroutine call - solvable via Fortran 2003 enhanced array constructor? Help	44	2140	August 24, 2023
Explicit-shape and assumed-size arrays, and sequence association	12	2541	February 16, 2022
Simple elemental sequence association?	4	770	May 7, 2022
A need for a proposal for an enforced INTENT(IN)? Language enhancement	14	1119	January 7, 2022