Iso_c_binding: interface to a C function returning a string

epagone · December 30, 2020, 6:08pm

Is it possible to write an interface for a C function returning a string?

Example

char *foo(char *bar1, int baz1, char *bar2, int baz2, char *qux)

I can imagine that it is possible to write a wrapper C function returning void in this way

void wrapfoo(char *foo, char *bar1, int baz1, char *bar2, int baz2, char *qux)

and this should work

interface
  subroutine wrapfoo(foo, bar1, baz1, bar2, baz2, qux) bind(c, name="wrapfoo")
    character(kind=c_char) :: foo(*)
    character(c_char) :: bar1(*)
    int(c_long), value :: baz1
    character(c_char) :: bar2(*)
    int(c_long), value :: baz2
    character(kind=c_char) :: qux(*)
  end subroutine wrapfoo
end interface

but I would definitely like to avoid that. Any suggestions to directly interface function foo above?

ivanpribec · December 30, 2020, 6:45pm

If you don’t mind creating a copy of the C string, then you can do something similar to:

! C - function
interface
    ! char *nlopt_algorithm_name(int algorithm) 
    type(c_ptr) function nlopt_algorithm_name(algorithm) bind(c,name="nlopt_algorithm_name")
        import c_int, c_ptr
        integer(c_int), value :: algorithm
    end function
end interface

! Fortran wrapper
function algorithm_name(a) result(name)
    integer(c_int), intent(in) :: a
    character(len=:,kind=c_char), allocatable :: name
    character(len=256,kind=c_char), pointer :: buffer
    type(c_ptr) :: cstring
    cstring = nlopt_algorithm_name(a)
    call c_f_pointer(cstring,buffer)
    name = buffer(1:index(buffer,c_null_char))
end function

In this case I knew the returned string will not be longer than 256 characters by inspection of the original C code.

EDIT: This wrapper likely creates a memory leak! See my post below for a second solution.

epagone · December 30, 2020, 7:34pm

Many thanks @ivanpribec! Unfortunately, gfortran 10.2.0 segfaults

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

at line

name = buffer(1:index(buffer,c_null_char))

I wonder if it’s a bug of gfortran or if I have done something wrong. I’ll try to test ifort but I don’t know if there might be incompatibilities since I have used the gcc toolchain to compile the C library. I’ll post an update as soon as I can test ifort.

Update: also ifort 2021.1.2 20201208 segfaults. Almost certainly now I believe that I have made a mistake in adapting the example to my case. I’ll update again when I discover something relevant

ivanpribec · December 30, 2020, 7:46pm

I noticed that I was wrongly including the terminating null string too. The line should be:

name = buffer(1:index(buffer,c_null_char)-1)

epagone · December 30, 2020, 7:52pm

Thanks @ivanpribec. Unfortunately, also the version with buffer(1:index(buffer,c_null_char)-1) segfaults both with gfortran and ifort.

FortranFan · December 30, 2020, 7:52pm

For most people in scientific and technical computing who primarily aspire to bring about meaningful advances in their chosen domains rather than get into the weeds of stack and heap allocations and memory leaks and dangled pointers and so forth, functions in C, C++ whose return values are pointers are fraught.

So if it’s possible for library authors who may have more expertise in C, C++ to stay away from such return values, especially with char * and strings where the semantics in C gets further tricky, that will be rather kind of them toward the users of their libraries. And if the users are expected to be Fortranners as well, such expert library authors can use the enhanced interoperability facility introduced in Fortran 2018 to make it further easier for them. Here is a simple how-to example of this from the Intel forum.

But now if a reauthoring of the C function is not viable, then eschewing some type safety by using type(c_ptr) as the return type is an option, as shown by @ivanpribec above.

As commented by @ivanpribec re: string copy, if it is to be avoided, then employ the C library function strlen that can be readily invoked from Fortran when it is interoperating with a C companion processor:

 ..
    character(len=:,kind=c_char), pointer :: buffer
    integer(c_int) :: lenbuffer
    type(c_ptr) :: cstring
    cstring = nlopt_algorithm_name( a )
    lenbuffer = strlen( cstring )  ! Elided is the interface for this C library function
    block
          character(len=lenbuffer, kind=c_char), pointer :: s
          call c_f_pointer(cstring, s)
          buffer => s
    end block
    print *, "Fortran string: ", buffer
..

Edit: removed the ‘function … end function’ construct from the code snippet per @ivanpribec’s comment.

ivanpribec · December 30, 2020, 7:58pm

Should buffer be the result value in the function instead of name?

epagone · December 30, 2020, 8:24pm

Something weird is going on: when I use @FortranFan’s suggested version, I get the segfault at line

lenbuffer = strlen(cstring)

ivanpribec · December 30, 2020, 8:31pm

I have create a more complete example. Let’s assume a C function which returns a string representation of an integer:

// int2str.c

#include <stdio.h>
#include <stdlib.h>

char *int2str(int i) {
    
    int length = snprintf( NULL, 0, "%d", i );
    char* str = malloc( length + 1 );
    snprintf( str, length + 1, "%d", i );
    return str;
}

This function first determines the length of the the character pointer necessary, before writing the integer value i to the instance str.

The next step is defining the Fortran interfaces:

! int2str_mod.f90
module int2str_mod

  use, intrinsic :: iso_c_binding

  interface
    type(c_ptr) function c_int2str(i) bind(c,name="int2str")
      import c_int, c_ptr
      integer(c_int), value :: i
    end function
    integer(c_size_t) function c_strlen(s) bind(c,name="strlen")
      import c_size_t, c_ptr
      type(c_ptr), intent(in), value :: s
    end function
    subroutine c_free(ptr) bind(c,name="free")
      import c_ptr
      type(c_ptr), value :: ptr
    end subroutine
  end interface

As per @FortranFan’s suggestion, this time I am using the C standard library function strlen to recover the length of the string in C. This function does not count the trailing “\n” character used as the string terminator. Next we write a wrapper function in Fortran:

contains

  function int2str(i) result(str)
    integer(c_int), intent(in) :: i
    character(:,c_char), allocatable :: str
    type(c_ptr) :: cstr
    integer(c_size_t) :: n

    cstr = c_int2str(i)
    n = c_strlen(cstr)
    allocate(character(len=n,kind=c_char) :: str)
    block
      character(len=n,kind=c_char), pointer :: s
      call c_f_pointer(cstr,s)  ! Recovers a view of the C string
      str = s                   ! Copies the string contents
    end block
    call c_free(cstr)
  end function

end module

We can use the c_strlen function to correctly allocate both the return value str and the temporary buffer within the block section. After running the code in my previous reply through valgrind I was surprised to find out it created a memory leak! It looks like for the C function defined above it is necessary to free the memory explicitly. This is done by calling the void free(void * ptr) function.

A short example program:

! main.f90
program main
  use int2str_mod

  print *, int2str(11_c_int), len(int2str(11_c_int))
  print *, int2str(121_c_int), len(int2str(121_c_int))
  print *, int2str(1221_c_int), len(int2str(1221_c_int))
end program

Compiling and running the program:

$ gfortran -Wall -ggdb3 int2str.c int2str_mod.f90 main.f90
$ ./a.out
 11           2
 121           3
 1221           4

Output with valgrind:

$ valgrind --leak-check=full ./a.out
==30654== Memcheck, a memory error detector
==30654== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==30654== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==30654== Command: ./a.out
==30654== 
 11           2
 121           3
 1221           4
==30654== 
==30654== HEAP SUMMARY:
==30654==     in use at exit: 0 bytes in 0 blocks
==30654==   total heap usage: 33 allocs, 33 frees, 13,626 bytes allocated
==30654== 
==30654== All heap blocks were freed -- no leaks are possible
==30654== 
==30654== For counts of detected and suppressed errors, rerun with: -v
==30654== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Edit: Removed an unneccesary call to c_strlen.

ivanpribec · December 30, 2020, 8:56pm

On further inspection, you can also remove the unnecessary call to allocate in the Fortran int2str function.

I have tested the code going back to gfortran-5. With Intel Fortran it also appears to work.

epagone · December 30, 2020, 11:39pm

Many thanks to @ivanpribec and @FortranFan. I confirm that the suggested solution works perfectly on a test program. For some reason, the library that I’m trying to use causes a segfault and I still don’t understand why (maybe it is returning rubbish). I’ll investigate more but I guess the problem is not any more the one in the title of this thread.

epagone · December 31, 2020, 12:16pm

Just for the records, I’ve solved the problem: it was a documentation issue. The documents describing the library reported a wrong signature. Looking at the source code, I was able to write the correct interface that worked as expected.

It’s unsettling that the problem manifested with a mysterious segfault, but I guess that it’s just a consequence of dealing with C and it’s a stark reminder on the reason why I prefer so much developing code in Fortran.

FortranFan · December 31, 2020, 9:26pm

As I mentioned in my earlier post, if it’s possible to look forward instead and work with Fortran 2018 and the C library author(s) are willing to be kind to Fortranners, it’ll be better to provide the wrappers on the C side itself where author(s) of such libraries can avail themselves of all their memory management approaches cleanly. See a modification of @ivanpribec’s example where the memory management is modeled for illustration purposes using basic C malloc, memcpy, and free:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include "ISO_Fortran_binding.h"

char *int2str(int i) {
    
    int length = snprintf( NULL, 0, "%d", i );
    char* str = malloc( length + 1 );
    snprintf( str, length + 1, "%d", i );
    return str;
}

// Wrapper for Fortran users
void Fint2str( int i, CFI_cdesc_t *str ) {
    char *s = int2str(i);
    size_t lens = strlen(s);
    int irc = CFI_allocate(str, (CFI_index_t *)0, (CFI_index_t *)0, lens);
    memcpy(str->base_addr, s, lens);
    free(s);
}

Then the Fortran code becomes really straightforward and plays to Fortran’s major strength with de facto “smart pointer” plus assured “garbage collection” via the ALLOCATABLE attribute:

   use, intrinsic :: iso_c_binding, only : c_int, c_char

   interface
      subroutine Fint2str( i, str ) bind(C, name="Fint2str")
         import :: c_int, c_char
         integer(c_int), intent(in), value :: i
         character(kind=c_char, len=:), allocatable, intent(out) :: str
      end subroutine 
   end interface

   character(kind=c_char, len=:), allocatable :: s

   call Fint2str( 42_c_int, s )
   print *, "s: ", s, "; expected is 42"
   print *, "len(s): ", len(s), "; expected is 2"
   
end

The above should have no “memory leak” issues; it does NOT with Intel oneAPI per my testing on Windows OS:

C:\Temp>cl /c /W3 /EHsc c.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.26.28806 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

c.c

C:\Temp>ifort /c /standard-semantics /warn:all /stand:f18 fc.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.1.2 Build 20201208_000000
Copyright (C) 1985-2020 Intel Corporation. All rights reserved.

C:\Temp>link fc.obj c.obj /subsystem:console /out:fc.exe
Microsoft (R) Incremental Linker Version 14.26.28806.0
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Temp>fc.exe
s: 42; expected is 42
len(s): 2 ; expected is 2

C:\Temp>

ivanpribec · January 1, 2021, 6:27pm

I experimented with a very similar solution, that appeared to work with the Intel Fortran compiler:

  interface
    subroutine int2str_helper(fstr,i) bind(c,name="int2str_helper")
      import c_char, c_int
      character(:,c_char), allocatable, intent(inout) :: fstr
      integer(c_int), intent(in), value :: i
    end subroutine
  end interface

But compiling with gfortran leads to a (IMO) spurious error:

$ gfortran int2str_mod.f90 -c
int2str_mod.f90:22:34:

  22 |     subroutine int2str_helper(fstr,i) bind(c,name="int2str_helper")
     |                                  1
Error: Character argument ‘fstr’ at (1) must be length 1 because procedure ‘int2str_helper’ is BIND(C)

I definitely agree with your conclusion once you do the memory management on the C side, the code in Fortran is very simple and clean. Unfortunately, so far I don’t see many C developers using the capabilities to provide “native” Fortran API’s for their libraries.

FortranFan · January 2, 2021, 2:22am

Yes, notwithstanding the availability of ISO_Fortran_binding.h and associated types and procedures with GCC toolset, gfortran has gaps.

When it comes to Section 18.3.6 Interoperability of procedures and procedure interfaces in the Fortran standard, gfortran does not conform to some of the Fortran 2018 semantics and misses out on key aspects.

une · February 6, 2021, 8:56pm

Hi @ivanpribec. It would be nice to have this functionality available as fpm package. If you agree, I can create a pull request.

ivanpribec · February 6, 2021, 10:00pm

Hi @une, feel free to use the code as desired. I would encourage you to propose a specification for the stdlib project. The workflow is described here: stdlib/WORKFLOW.md at master · fortran-lang/stdlib · GitHub

A Fortran implementation for a similar int to string function can be found here: String handling routines · Issue #69 · fortran-lang/stdlib · GitHub

Topic		Replies	Views
Best practices for passing C strings	19	6241	July 2, 2024
Experiment: dealing with a pointer to a string from C	8	2023	April 1, 2021
Best way to retrieve a string from C Help	2	482	January 30, 2023
Using the same procedure name in Fortran as interfaced C function Help	2	319	April 2, 2023
Calling C from Fortran: Passing Vector of Strings from C to Fortran Homework	6	797	April 26, 2021

Iso_c_binding: interface to a C function returning a string

Related topics