Allocate interoperability and C descriptors

PierU · January 26, 2023, 5:04pm

Let’s say one have a Fortran library with some routines that have allocatable arrays as dummy arguments, and one want to pass data in these arguments from C. Constraints: the library must not be modified, and the data must not be duplicated.

It looks to me that C descriptors are needed here… I am not used to them, and I wrote a toy example:

Fortran side:

module somemodule
    implicit none
	
    contains

    subroutine frout(a)
        double precision, allocatable, intent(in) :: a(:)
	write(*,*) lbound(a), size(a), ubound(a)
	write(*,*) a(:)
    end

end module

! assuming one doesn't want to add the C binding features in 
! the original module, a wrapper routine is needed
subroutine frout_wrap(a) bind(C)
    use, intrinsic :: iso_c_binding
    use somemodule
    real(c_double), allocatable, intent(in) :: a(:)
    call frout(a)
end subroutine

C side:

#include <stdlib.h>
#include <ISO_Fortran_binding.h>

void* frout_wrap(CFI_cdesc_t*);

int main()
{
	double* mydata = (double*)malloc(6*sizeof(double));
	for (int i=0; i<6; i++) mydata[i] = (double)i;

	CFI_cdesc_t mycfi;
	
	mycfi.base_addr = (void*)mydata;
	mycfi.elem_len  = sizeof(double);
	mycfi.attribute = CFI_attribute_allocatable;
	mycfi.rank      = 1;
	mycfi.type      = CFI_type_double;
	mycfi.dim[0].lower_bound = -2;
	mycfi.dim[0].extent = 6;
	mycfi.dim[0].sm = sizeof(double);

	frout_wrap(&mycfi);
}

It works, I get the expected output:

ifort -c cfinteropf.f90 ; icc cfinterop.c cfinteropf.o -lifcore -static && a.out
          -2           6           3
  0.000000000000000E+000   1.00000000000000        2.00000000000000     
   3.00000000000000        4.00000000000000        5.00000000000000

However, is it the right way (by hand) to build the descriptor? I’ve seen the routine CFI_establish() that seems dedicated to that, but couldn’t get a working example with it…

sblionel · January 26, 2023, 7:18pm

No, it’s not the right way. You are not supposed to modify a C descriptor directly. Worse, you’re creating a Fortran allocatable and “allocating” it without using CFI_allocate. Here’s an example I wrote - I’ll take your code and rewrite it shortly.

#include "ISO_Fortran_binding.h"
#include <memory.h>
#include <stdio.h>

extern "C" void greetings(CFI_cdesc_t * descr);

int main()
{
	int status;
	CFI_CDESC_T(0) cdesc;

	// Create our own local descriptor for an allocatable string
	status = CFI_establish((CFI_cdesc_t *)&cdesc, NULL,  CFI_attribute_allocatable,
 		                CFI_type_char, 1, 0, NULL);
	//Allocate the string to length 7
	status = CFI_allocate((CFI_cdesc_t *)&cdesc, NULL, NULL, 7);
	// Copy in 'Hello, '
	memcpy(cdesc.base_addr, "Hello, ", 7);
	// Call Fortran to append to the string and print it
	greetings((CFI_cdesc_t *)&cdesc);
	printf("Length is now %zd\n", cdesc.elem_len);
	status = CFI_deallocate((CFI_cdesc_t *)&cdesc);
}

subroutine greetings (string) bind(C)
    implicit none
    character(:), allocatable :: string
    
    string = string // 'Zurich!'
    print *, string
    end subroutine greetings

sblionel · January 26, 2023, 8:04pm

Here’s your replacement C code (didn’t touch the Fortran):

#include <stdlib.h>
#include <memory.h>
#include <ISO_Fortran_binding.h>

void* frout_wrap(CFI_cdesc_t* descr);

int main()
{
	double* mydata = (double*)malloc(6 * sizeof(double));
	int status;

	CFI_CDESC_T(1) mycfi;
	CFI_index_t lb[1], ub[1];

	for (int i = 0; i < 6; i++) mydata[i] = (double)i;

	status = CFI_establish(
		(CFI_cdesc_t*)&mycfi,
		NULL,
		CFI_attribute_allocatable,
		CFI_type_double,
		0,  // elem_size ignored
		1,  // rank
		NULL);
	lb[0] = -2; ub[0] = 3;

	status = CFI_allocate(
		(CFI_cdesc_t*)&mycfi,
		lb, ub, sizeof(double));

	memcpy(mycfi.base_addr, (void *)mydata, 6*sizeof(double));


	frout_wrap(&mycfi);
}

Output:

          -2           6           3
  0.000000000000000E+000   1.00000000000000        2.00000000000000
   3.00000000000000        4.00000000000000        5.00000000000000

sblionel · January 26, 2023, 8:17pm

Improved version:

#include <stdlib.h>
#include <ISO_Fortran_binding.h>

void* frout_wrap(CFI_cdesc_t* descr);

int main()
{
	double* mydata;
	int status;

	CFI_CDESC_T(1) mycfi;
	CFI_index_t lb[1], ub[1];


	status = CFI_establish(
		(CFI_cdesc_t*)&mycfi,
		NULL,
		CFI_attribute_allocatable,
		CFI_type_double,
		0,  // elem_size ignored
		1,  // rank
		NULL);
	lb[0] = -2; ub[0] = 3;

	status = CFI_allocate(
		(CFI_cdesc_t*)&mycfi,
		lb, ub, sizeof(double));

	mydata = CFI_address((CFI_cdesc_t *)&mycfi,lb);
	for (int i = 0; i < 6; i++) mydata[i] = (double)i;

	frout_wrap(&mycfi);
}

ivanpribec · January 26, 2023, 9:27pm

Just a minor comment, I believe it is not required that double precision and real(c_double) have the same kind, although in practice it appears to be true for all widely used Fortran processors and their companion C processors.

While both kinds are typically 64-bit floating point formats on modern processors, their definitions are at odds with each other.

For C (section 6.2.5, paragraph 10) the standard [0] says:

There are three real floating types, designated as float, double, and long double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.

Additionally, in Annex F, IEC 60559 floating-point arithmetic (section F.2, paragraph 1):

The C floating types match the IEC 60559 formats as follows:

The float type matches the IEC 60559 single format.

The double type matches the IEC 60559 double format.

The long double type matches an IEC 60559 extended format else a non-IEC 60559 extended format, else the IEC 60559 double format.

Any non-IEC 60559 extended format used for the long double type shall have more
precision than IEC 60559 double and at least the range of IEC 60559 double.

You can check for conformance with this annex via the preprocessor define:

An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex.

In the Fortran standard [1] on the other hand (section 7.4.3.2 Real type):

If the type keyword REAL is used without a kind type parameter, the real type with default real kind is specified and the kind value is KIND (0.0). The type specifier DOUBLE PRECISION specifies type real with double precision kind; the kind value is KIND (0.0D0). The decimal precision of the double precision real approximation method shall be greater than that of the default real method.

The decimal precision of double precision real shall be at least 10, and its decimal exponent range shall be at least 37. It is recommended that the decimal precision of default real be at least 6, and that its decimal exponent range be at least 37.

In addition to this you have constraints on the types when used in a storage association context (section 19.5.3.2):

(1) a nonpointer scalar object that is default integer, default real, or default logical occupies a single numeric storage unit,
(2) a nonpointer scalar object that is double precision real or default complex occupies two contiguous numeric storage units.

To make sure your Fortran kind matches the C kind, you could define your kind specifier in the bind(c) wrapper like this:

integer, parameter :: dp = merge(c_double,-99,c_double == kind(1.0D0))

Since the kind specifier must be a non-negative integer, your Fortran processor will flag this as an error. Alternatively, you will get warned of a kind mismatch at the procedure call site.

I see this as one argument in favor of using a kind parameter (e.g. real(dp) instead of double precision) - you can fix the mismatch between your C and Fortran kinds in one line. I admit this scenario is extremely unlikely to ever occur in practice.

module somemodule
    implicit none
    integer, parameter :: dp = kind(1.0D0)
    contains

    subroutine frout(a)
        real(dp), allocatable, intent(in) :: a(:)
	write(*,*) lbound(a), size(a), ubound(a)
	write(*,*) a(:)
    end subroutine
end module

[0] C11 Draft (ISO/IEC 9899:201x) - https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
[1] F2018 Draft (ISO/IEC FDIS 1539-1) - https://www.nag.com/market/training/fortran-workshop/f2018_standard.pdf

PierU · January 26, 2023, 10:19pm

@sblionel Thanks! I was a bit lost among the Metcalf et al. C descriptor code examples…

Apart for the name mangling, does the Fortran callee routine require the bind(C) attribute in such a case ?

Next level: what if the dummy argument is a derived type with an allocatable array component? From all what I’ve read/tried, it seems impossible to pass something in the array from C… Am I correct?

module somemodule
    implicit none

    type sometype
      double precision, allocatable :: a(:)
    end type
	
    contains

    subroutine frout2(x)
        type(sometype), intent(in) :: x
	write(*,*) lbound(x%a), size(x%a), ubound(x%a)
	write(*,*) x%a(:)
    end

end module

! Won't compile because "sometype" has not the bind(C) attribute,
! and if added the compiler complains that a bind(C) type shall not
! contain an allocatable
subroutine frout2_wrap(x) bind(C)
    use, intrinsic :: iso_c_binding
    use somemodule
    type(sometype), intent(in) :: x
    call frout2(x)
end subroutine

PierU · January 26, 2023, 10:29pm

I agree. At least there would be a compilation error in the wrapper routine, should c_double and double precision be different types. Given the constraints (existing Fortran library that one doesn’t want to modify, and no data duplication) the only solution is to look for (and use) the right type -if available- on the C side (and in the wrapper).

sblionel · January 26, 2023, 10:39pm

Apart for the name mangling, does the Fortran callee routine require the bind(C) attribute in such a case ?

Yes - this tells the compiler to expect a C descriptor. ifort, for example, will use its “traditional” descriptor if you don’t do this. bind(C) also prevents passing or looking for hidden arguments.

Next level: what if the dummy argument is a derived type with an allocatable array component? From all what I’ve read/tried, it seems impossible to pass something in the array from C… Am I correct?

You are correct - derived types with pointer or allocatable components are not interoperable. You can declare a derived type with a component of type(C_PTR), but that’s obviuously not the same thing. It is what you’d have to do if you wanted to pass pointers around.

ivanpribec · January 26, 2023, 11:10pm

You can still make it work, but to get hold of the non-interoperable Fortran type you’ll need to introduce a typeless “handle” object:

module somemodule_wrap
  
  use, intrinsic :: iso_c_binding, only: &
      c_double, c_int, c_ptr, &
      c_f_pointer, c_loc
  use somemodule, only: sometype, frout2
  implicit none

contains

  type(c_ptr) function sometype_create(len) bind(c)
    integer(c_int), value :: len
    type(sometype), pointer :: p
    allocate(p)
    allocate(p%a(len))
    sometype_create = c_loc(p)
  end function

  type(c_ptr) function sometype_a(ptr, n) bind(c)
    type(c_ptr), intent(in), value :: ptr
    integer(c_int), intent(out) :: n
    type(sometype), pointer :: p => null()

    call c_f_pointer(ptr,p)
    if (allocated(p%a)) then
      n = size(p%a)
      sometype_a = c_loc(p%a)
    else
      sometype_a = c_null_ptr
    end if

  end function

  subroutine sometype_free(ptr) bind(c)
    type(c_ptr), value :: ptr
    type(sometype), pointer :: p => null()
    call c_f_pointer(ptr,p)
    if (allocated(p%a)) deallocate(p%a)
    deallocate(p)
  end subroutine

  subroutine frout2_wrap(ptr) bind(c)
    type(c_ptr), value :: ptr
    type(sometype), pointer :: p => null()
    call c_f_pointer(ptr,p)
    if (allocated(p%a)) then
      call frout2(p)
    end if
  end subroutine


end module

// main.c
//
#include "somemodule_wrap.h"

int main(void) 
{
    void *h_sometype = sometype_create( 6 );

    int n;
    double *a;

    a = sometype_a(h_sometype, &n);
    for (int i = 0; i < n; i++) {
        a[i] = (double) i;
    }

    frout2_wrap(h_sometype);
    sometype_free(h_sometype);
    
    return 0;
}

Example build process and output:

ivan:~/fortran/somemodule$ make
gfortran -Wall -fcheck=all -std=f2018 -c somemodule.f90
gfortran -fc-prototypes -fsyntax-only somemodule_wrap.f90 > somemodule_wrap.h
gfortran -Wall -fcheck=all -std=f2018 -c somemodule_wrap.f90
gcc-10 -Wall -std=c11 -o main main.c somemodule_wrap.o somemodule.o -lgfortran
ivan:~/fortran/somemodule$ ./main
           1           6           6
   0.0000000000000000        1.0000000000000000        2.0000000000000000        3.0000000000000000        4.0000000000000000        5.0000000000000000

gfortran is the only Fortran compiler I’m aware of that supports generation of the corresponding C prototypes.

Edit: here are the build files (both CMake and a custom Makefile) in case anyway is interested:

ivan:~/fortran/somemodule$ tree
.
├── CMakeLists.txt
├── Makefile
├── main.c
├── somemodule.f90
└── somemodule_wrap.f90

0 directories, 5 files

CMakeLists.txt

cmake_minimum_required(VERSION 3.12.0)

project(someproject VERSION 0.1.0 LANGUAGES C Fortran)

add_library(
    somemodule
    somemodule.f90
    somemodule_wrap.f90
)

add_custom_command(
    OUTPUT somemodule_wrap.h
    COMMAND gfortran -fc-prototypes -fsyntax-only ${CMAKE_CURRENT_SOURCE_DIR}/somemodule_wrap.f90 > ${CMAKE_CURRENT_SOURCE_DIR}/somemodule_wrap.h
    DEPENDS somemodule_wrap.f90
)

add_executable(main main.c somemodule_wrap.h)
target_link_libraries(main somemodule)

Makefile

FC=gfortran
CC=gcc-10

FCFLAGS=-Wall -fcheck=all -std=f2018
CFLAGS=-Wall -std=c11

LDFLAGS=-lgfortran

.phony: all clean

all: main

main: main.c somemodule_wrap.o somemodule.o
	$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)

main.c: somemodule_wrap.h

somemodule_wrap.h: somemodule_wrap.f90
	gfortran -fc-prototypes -fsyntax-only $< > $@

somemodule_wrap.f90: somemodule.mod

somemodule_wrap.o somemodule_wrap.mod: somemodule_wrap.f90 somemodule.mod
	$(FC) $(FCFLAGS) -c $<

somemodule.o somemodule.mod: somemodule.f90
	$(FC) $(FCFLAGS) -c $<

clean:
	rm -rf *.o *.mod somemodule_wrap.h main

FortranFan · January 27, 2023, 12:42am

It’s not typeless, it’s of type(c_ptr) which is a specific type. Some might view it as an “opaque pointer” for it maps closest to void * in C but regardless it is not typeless in Fortran. An example of what would be typeless in Fortran is what the standard calls BOZ literals.

ivanpribec · January 27, 2023, 12:59am

Fair point. In the C source file the handle is of type pointer to void so it does have a type. In some C libraries, cuSPARSE for example, they use the adjective generic to describe an API where “the data reference does not carry the type itself (e.g. void* )”.

FortranFan · January 27, 2023, 4:16am

The improved version overlooked the CFI_deallocate.

For programs initiated by means other than a Fortran processor, which is what is shown here, it is good practice to have the CFI_deallocate on CFI_cdesc_t objects that are employed with CFI_allocate.

FortranFan · January 27, 2023, 4:18am

if ( allocated(string) ) then
   string = string // 'Zurich!'
   print *, string
else
   ! ??
end if

PierU · January 27, 2023, 7:54am

I went through a similar solution at some point, but rejected it because data duplication was needed. But I didn’t get that in some cases it would be possible to allocate on the Fortran side before loading/generating the data on the C side (so no data duplication is needed), as you show here. Thanks.

I didn’t know… nice

PierU · January 27, 2023, 10:43am

I didn’t tested until now but I have actually an issue:

mydata = (double*)CFI_address((CFI_cdesc_t *)&mycfi,lb); returns a null pointer, so I get a segmentation violation when trying to fill the array.

This is with icc/ifort 18 or 19 (don’t have a more recent version).

It works fine with gcc/gfortran 10, so I guess there is a bug in icc/ifort 18/19 (maybe it has been corrected in recent version, but I can’t check).

Also frout_wrap(&mycfi); should be frout_wrap((CFI_cdesc_t*)&mycfi);

FortranFan · January 27, 2023, 1:46pm

The advice to not manipulate struct members of objects of CFI_cdesc_t in code is good, but the case is nuanced when it comes to referencing them.

In certain situations - as also indicated by Metcalf et al. in Modern Fortran Explained - it is a reasonably convenient practice to employ them directly and move on:

mydata = (double *)mycfi.base_addr;

sblionel · January 27, 2023, 5:42pm

The original example did not do a deallocate, so there was nothing to overlook. Yes, it is good practice to deallocate things when you’re done, but everything gets deallocated at program exit so few people bother with that. I didn’t add it because it would obscure the lesson here.

sblionel · January 27, 2023, 5:44pm

Yes, I could have done that but thought it was helpful to show CFI_address. It isn’t a problem to read the descriptor, just don’t modify it without using the CFI functions.

sblionel · January 27, 2023, 5:50pm

I did test the code I posted. If I make the change you show, it still works OK for me.

As for the frout_wrap change, that looks fine. I’m not a C programmer, and saw the compiler’s grumbling about that line but wasn’t sure how to change it so I left it alone.

PierU · January 27, 2023, 8:33pm

Probably a bug that has been corrected in the latest versions of icc/ifort.

Me neither . I do a lot by trial/error in C. I finally got the point that C pointers were typed, contrary to a common belief, and that although compilers often cast them when needed, explicitely casting them was a good practice.

Topic		Replies	Views
Manipulating array descriptors from C? Help	8	498	November 15, 2023
Using a Fortran derived type with allocatable components from C	1	1328	May 15, 2022
Allocatable character array in BIND(C) dummy argument - GFortran 11 bug?	0	326	September 22, 2023
Iso_c_binding: pass an array from c to fortran (Edit: python interop content)	25	3419	June 14, 2024
Examples: iso_c_binding calling Fortran from C	6	1489	August 11, 2022

Allocate interoperability and C descriptors

Related topics