One downside of the interface that accepts a scalar function is that due to the cross-language calls, you are unlikely to benefit from inlining and/or vectorization. Maybe inter-procedural optimization can help, but we’d need to verify this to to be sure.
The other way is to modify the callback, and make it the duty of the callback to evaluate all elements:
// apply_vfun.c
#include <stdio.h>
// evaluate y_i = f(x_i), for i = 0, n-1
typedef void (*vfun)(int,
const double *restrict,
double *restrict);
void apply_vfun(vfun f, int n, const double *x, double *y)
{
printf("applying vector fun for n = %d\n", n);
f(n, x, y);
}
Here is IMO a good reason why the dimension(n) syntax is not meaningless – it allows the use of array expressions in the Fortran callback, which are amenable to auto-vectorization. If array syntax is not enough, then either an elemental function or an explicit loop with !$omp simd will do the trick.
! demo.f90
program demo
use, intrinsic :: iso_c_binding
implicit none
abstract interface
subroutine vfun(n,x,y) bind(c)
import c_int, c_double
implicit none
integer(c_int), value :: n
real(c_double), intent(in) :: x(n)
real(c_double), intent(out) :: y(n)
end subroutine
end interface
interface
subroutine apply_vfun(f,n,x,y) bind(c)
import c_int, c_double, vfun
implicit none
procedure(vfun) :: f
integer(c_int), value :: n
real(c_double), intent(in) :: x(n)
real(c_double), intent(out) :: y(n)
end subroutine
end interface
integer, parameter :: n = 32
real(c_double) :: x(n), y(n)
integer :: i
call random_number(x)
! y := f(x)
call apply_vfun(my_vfun,n,x,y)
write(*,'(A)') "Results:"
write(*,'(*(G0,2X,G0,/))') [(x(i),y(i),i=1,n)]
contains
! Callback passed to C
subroutine my_vfun(n,x,y) bind(c)
integer(c_int), value :: n
real(c_double), intent(in) :: x(n) ! Note the x(n) syntax, ...
real(c_double), intent(out) :: y(n)
! ...that allows us to use array expression.
y = sin(x) * x**2
! if x(*) were used, then we'd need to use the
! explicit x(1:n) syntax instead
end subroutine
end program
If you were to look at this in Compiler Explorer, and compiler options like -O3 -march=skylake, my_vfun contains the hot loop,
.L9:
vmovupd ymm0, YMMWORD PTR [r14]
add r14, 32
add r15, 32
vmulpd ymm2, ymm0, ymm0
vmovapd YMMWORD PTR [rbp-80], ymm2
call _ZGVdN4v_sin
vmulpd ymm0, ymm0, YMMWORD PTR [rbp-80]
vmovupd YMMWORD PTR [r15-32], ymm0
cmp r14, rbx
jne .L9
that uses the 256-bit AVX registers (we evaluate 4 elements at the same time), including a vector implementation of the sine function.
This is one the reasons I like Fortran.