Calling C++ from Fortran

C++ 11 has a rich collection of random number generators for various distributions. They can be accessed from Fortran if functions calling them are declared extern "C". For example for

normal_array.cpp

// https://people.duke.edu/~ccc14/sta-663-2018/notebooks/S12_CPP_Annotated.html

#include <random>
#include <functional> // needed for bind
extern "C" void normal_vec(int seed, int n, double x[]);

using std::default_random_engine,std::normal_distribution, std::bind;

void normal_vec(int seed, int n, double x[]) {
	// start random number engine with fixed seed
	default_random_engine re{seed};
	normal_distribution<double> norm(0,1); // mean and standard deviation
	auto rnorm = bind(norm, re);
	for (int i = 0; i<n; ++i) x[i] = rnorm();
}

and xnormal.f90

program random_normal
use iso_c_binding
implicit none
!
interface
subroutine normal_vec(seed, n, x) bind(c)
import c_int, c_double
integer(kind=c_int), intent(in), value :: seed, n
real(kind=c_double)                    :: x(n)
end subroutine normal_vec
end interface
!
integer :: seed
integer, parameter :: n = 5
real(kind=c_double) :: x(n)
!
do seed=1,3
   call normal_vec(seed,n,x)
   print "(*(1x,f7.4))",x
end do
end program random_normal

compiling and running on WSL2 with

rm -f a.out
g++ -c normal_array.cpp
gfortran -c xnormal.f90
gfortran normal_array.o xnormal.o
./a.out

(using ifort or nvfortran also works) gives

 -0.1220 -1.0868  0.6843 -1.0752  0.0333
  0.3526 -0.2002 -1.9825 -0.8651  0.9774
 -1.6147 -1.3718 -0.2794  0.0946  0.2605

Is it generally the case that C++ functions can be called from Fortran if they are declared extern “C” and they have C-like arguments?

3 Likes

I think the answer is yes if the interface is interoperable. You might occasionally encounter a namespace issue if you are trying to wrap a C++ function in an extern C function The other issue is do you use Fortran or C++ to do the final linking. If I remember correctly, ifort already knows about libc so it can link in C functions without you specifying -l libc. It doesn’t know about libc++ etc though so you have to add those to the link statement yourself. I doubt that the C++ compiler knows (or cares) anything about libfortran etc. Not sure about what gfortran needs.

2 Likes

extern "C" is used to declare a function or struct follows C linkage conventions. Essentially it ensures the names (symbols) are unmangled just like bind(C) in Fortran.

Concerning arguments, there are a few (if not more) caveats:

  • C and C++ have different definitions of bool, see Issues interfacing between C++ and Fortran - #15 by ivanpribec. In practice, if you have compatible C and C++ compilers, they boil to the same integer type. There’s an example of interfacing Fortran/C/C++ bools in @scivision’s GitHub repository: https://github.com/scivision/fortran-c-cpp-interface/blob/main/src/cxx/bool_main.cxx
  • You cannot call C++ overloaded functions, these are functions with the same name but different types or number of arguments (in Fortran this is called a generic interface). You will have to write wrappers with distinct names instead, e.g.
    // C++ code:
    void f(int);
    void f(double);
    extern "C" void f_i(int i) { f(i); }
    extern "C" void f_d(double d) { f(d); }
    
    You can then re-establish a generic interface on the Fortran side.

I’d also recommended going through the following resources:

A few more practical tips for creating wrappers:

  • Think in terms of Fortran interfaces. C++ libraries are often designed around objects that encapsulate state but are rarely interoperable with C directly. When writing a Fortran/C interface, prefer writing subroutines/functions which perform an “action” associated with those objects which however remain hidden in the C/C++ portion of the wrapper. The random number generator example fits under this pattern; instead of writing wrappers to expose the random number engine and distributions, those are kept hidden on the C++ side and only the “action” of filling an array with normally distributed values is exposed. (There are usage cases where this might not be possible, e.g. if the C++ object has non-trivial initialization or you want to keep just a single reference to an object on the Fortran side. This becomes much more complex as you essentially take control over the object’s lifetime.)
  • Use [] to indicate arrays in prototypes. A C function which accepts an array can be declared in two ways
    void f(int arr[]); // or
    void f(int *arr);
    
    The second form however is also used for pointers to scalars. Technically it makes no difference, but it helps as a visual reminder that arr is supposed to be an array, and not a reference to a scalar. This is also known as “array decay”, i.e. the array arr[] decays into a pointer to it’s first element.
  • Pass constant scalars by value. It’s easier to add the value attribute in Fortran than it is to de-reference a pointer at each point of usage in C.
  • Use const (C) and intent(in) (Fortran) to match intentions. This applies particularly to variables passed by reference. Be careful not to cast const away; read “const does not mean constant” and “Const and Optimization in C”.
  • Use containers such as std::span as type- and bound-safe views into (contiguous) Fortran arrays. Such containers “elevate” a decayed pointer into a STL-compatible container. Other libraries may provide similar domain-specific containers; two examples are the Eigen::Map<> and arma::Mat<> matrix classes, which can be used as wrappers for dense matrices.
8 Likes

Thanks. I have been looking at your example of Fortran calling C++ stdlib sort routines.

Regarding your suggestion to use [] rather than * for arrays in prototypes,

gfortran -fc-prototypes -c sum_vec.f90

on

module m
use iso_c_binding
implicit none
contains
subroutine sum_vec(n,x,xsum) bind(c)
integer(kind=c_int)  , intent(in)  :: n
real   (kind=c_float), intent(in)  :: x(n)
real   (kind=c_float), intent(out) :: xsum
xsum = sum(x)
end subroutine sum_vec
end module m

generates a prototype

void sum_vec (const int *n, const float *x, float *xsum);

but this can be adjusted manually to

void sum_vec (const int *n, const float x[], float *xsum);

Yes, I noticed that too. In Fortran there is a strong distinction between array and scalar variables, so I believe compiler could be modified to follow “my” rule. It’s a small distinction I’ve been thinking about lately. I still need to think about how it fits in with multi-dimensional arrays.

My C and C++ knowledge is meager, and I have not figured out how to pass 2D arrays from Fortran to C when the shape is not known at compile time. For

function sum_mat(nr,nc,x) result(xsum) bind(c)
integer(kind=c_int)  , intent(in)  :: nr,nc
real   (kind=c_float), intent(in)  :: x(nr,nc)
real   (kind=c_float)              :: xsum
xsum = sum(x)
end function sum_mat

gfortran gives a prototype

float sum_mat (const int *nr, const int *nc, const float *x);

but in the C function I think you must refer to x as a 1D array and cannot refer to elements as
x[i][j].

I believe that

float sum_mat (const int *nr, const float x[][N]);

is legal in C where N is a constant but not

float sum_mat (const int *nr, const int *nc, const float x[][]);

I have found that you need to link executables with a C++ compiler. Otherwise, C++ exceptions are not caught, and a program crashes.

If there are no C++ exceptions then I think it is fine to link with a fortran compiler. Not sure though.

Additionally considerable care is often needed to ensure many other aspects of C++ do not trip up the extern “C” functions.

C++, as most everyone will now know, is not “C with classes”, it is a different language in its own right that is gigantic, complex, and advancing further most rapidly.

But things that have long been around such as C++ exceptions will be foreign to the Fortran processor and programs can encounter undesired behavior:

Caveat emptor with C++ exceptions
#include <stdexcept>
using namespace std;

extern "C" void sub(int);

void sub(int n)
{
    if (n < 0)
        throw std::invalid_argument("Dummy argument must be >= 0.");
    return;
}
  • Fortran caller:
   use, intrinsic :: iso_c_binding, only : c_int
   interface
      subroutine sub( n ) bind(C, name="sub")
         import :: c_int
         integer(c_int), intent(in), value :: n
      end subroutine
   end interface
   integer(c_int) :: a
   a = -1
   call sub( a )
end
C:\temp>gfortran -c p.f90

C:\temp>gfortran p.o f.o -o p.exe -lstdc++

C:\temp>p.exe
terminate called after throwing an instance of 'std::invalid_argument'
  what():  Dummy argument must be >= 0.

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0  0x44f456
#1  0x4474d7
#2  0x40e4adfa
#3  0x40e4f1fa
#4  0x433315
#5  0x431801
#6  0x442e94
#7  0x443d27
#8  0x40160a
#9  0x401568
#10  0x4015ab
#11  0x4013c0
#12  0x4014f5
#13  0x409f7033
#14  0x41222650
#15  0xffffffff

Is this another tweet?

2+ years into Fortran stdlib and the Fortran ecosystem reboot, it might be useful to introduce mental filters:

  • if users have specific needs such as other random number generators, work with Fortran stdlib Community to get them added there and advocate the use of stdlib,
  • if the purpose is to illustrate mixed-language programming with C++ and Fortran with C as a bridge, use minimal examples that do not advise or prescribe any particular use case but only show the interoperability of generic objects and methods.

As things stand with the Fortran language and its standard, assumed-size and explicit-shape array dummy argument semantics can be misleading and/or error-prone and the standard has long veered toward assumed-shape arrays as a preferred approach starting Fortran 90.

Under the circumstances, the use of C descriptors from Fortran 2018 in C prototypes can be another consideration, to some extent at least it signals the C prototype is operating with something foreign, not a mere pointer or an array masquerading as a pointer.

Yes, it was tweeted. It’s true that Fortran stdlib has

but C++ has a wider range of RNG, the product of many programmer hours, which I linked to in my original message. Since the equivalent of the C++ STL will not be in the Fortran standard and implemented in Fortran compilers for perhaps a decade or more, I think it’s also worth illustrating how the C++ STL, for example sorting algorithms, can be called efficiently from Fortran.

@Beliavsky , the following compiles with no errors with gcc and icc.

extern float sum_mat(const int nr, const int nc, const float x[nc][nr]);

float sum_mat_c(const int nr, const int nc, const float x[nc][nr])
{
 return sum_mat(nr, nc,x);
}


You should also change the nr and nc arguments in your Fortran function to pass by value ie.

function sum_mat(nr,nc,x) result(xsum) bind(c)
integer(kind=c_int)  , value, intent(in)  :: nr,nc
real   (kind=c_float),        intent(in)  :: x(nr,nc)
real   (kind=c_float)              :: xsum
xsum = sum(x)
end function sum_mat
1 Like

Working towards “standardization” in Fortran stdlib is certainly a worthy goal. However, given that many of the needs can be met with libraries in C++, personally I think we should make heavier use of them. Most of the time I either lack the skills or can’t justify the time needed to write Fortran implementation for every possible programming need.

Wrapping a C++ library on the other hand offers several advantages:

  • typically, only a small amount of wrapper code is needed, much less than what would be needed to write an implementation from scratch
  • the C++ libraries are tried and tested by decades of use and thousands of programmers,
  • most of the Fortran compilers are offered along compatible C and C++ compilers,
  • using C/C++ as a shortcut, we can cut the development time ultimately enabling quicker progress towards “Fortranic” interfaces for our needs

From a more pragmatic point of view:

  • the routines are to be used as black-boxes anyways; I’ve never seen a complaint that the intrinsic math functions or random_number subroutine are not implemented in Fortran.

Fortran stdlib is both a specification and a reference implementation (preferably in Fortran). As long as the behavior matches the specification, anyone can choose to offer an implementation in C/C++ or any other language for that matter.

That said, there are also a few negatives that should be weighed in the process:

  • using two languages can complicate the build process
  • using multiple languages might lower the opportunities for in-lining or inter-procedural optimization
  • using C or C++, could be considered detrimental for Fortran compilers which do not support interoperability, or have companion compilers
  • not using Fortran might lead to a bias that it is a language which is unsuitable for certain kinds of algorithms

I agree that assumed-shape arrays should be the preferred approach. However, while much more powerful, C descriptors are also much more complex, especially if you start to play with allocatable and pointer variables. My long-term wish is to have a safe C++ container which would hide the “gruesome” internals of the Fortran C descriptor type. I have a hunch that the curiously recurring template pattern could be used to implement a form of static polymorphism, offering different member methods for different kinds of arrays.

2 Likes

Thanks. Now I can define some C functions with matrix arguments and use them from Fortran.

The C code

sum_mat.c
#include "sum_mat.h"
#include <stdio.h>

void print_mat(const int nr, const int nc, const float x[nr][nc])
{
	for (int ir=0; ir<nr; ir++) {
		for (int ic=0; ic<nc; ic++) {
			printf(" %f",x[ir][ic]);
		}
		printf("\n");
	}
}

void print_mat_transpose(const int nr, const int nc, const float x[nr][nc])
{
	for (int ic=0; ic<nc; ic++) {
		for (int ir=0; ir<nr; ir++) {
			printf(" %f",x[ir][ic]);
		}
		printf("\n");
	}
}

float sum_mat(const int nr, const int nc, const float x[nr][nc])
{
	float xsum = 0.0;
	for (int ir=0; ir<nr; ir++)
		for (int ic=0; ic<nc; ic++)
			xsum += x[ir][ic];
	return xsum;
}

void sum_rows(const int nr, const int nc, const float x[nr][nc], float sums[nr])
{
	for (int ir=0; ir<nr; ir++) {
		sums[ir] = 0.0;
		for (int ic=0; ic<nc; ic++) {
			sums[ir] += x[ir][ic];
		}
	}
}

void scale_mat(float xscale, const int nr, const int nc, float x[nr][nc])
{
	for (int ir=0; ir<nr; ir++) {
		for (int ic=0; ic<nc; ic++) {
			x[ir][ic] *= xscale;
		}
	}
}

with header file

void print_mat(const int nr, const int nc, const float x[nr][nc]);
void print_mat_transpose(const int nr, const int nc, const float x[nr][nc]);
float sum_mat(const int nr, const int nc, const float x[nr][nc]);
void sum_rows(const int nr, const int nc, const float x[nr][nc], float sums[nr]);
void scale_mat(float xscale, const int nr, const int nc, float x[nr][nc]);

can be called from a Fortran code xxmat.f90

program xxmat
use iso_c_binding, only: c_int, c_float
implicit none
integer, parameter :: nr = 2, nc = 3
real(kind=c_float) :: x(nr,nc),row_sums(nr),col_sums(nc)
integer            :: ir,ic
interface
subroutine print_mat(nrow_c, ncol_c, x) bind(c,name="print_mat_transpose")
import c_int, c_float
integer(kind=c_int), intent(in), value :: nrow_c, ncol_c
real(kind=c_float) , intent(in)        :: x(ncol_c,nrow_c)
end subroutine print_mat
!
function sum_mat(nrow_c,ncol_c,x) result(xsum) bind(c)
import c_int, c_float
integer(kind=c_int), intent(in), value :: nrow_c, ncol_c
real(kind=c_float) , intent(in)        :: x(ncol_c,nrow_c)
end function sum_mat
!
subroutine sum_cols(nrow_c,ncol_c,x,col_sums) bind(c,name="sum_rows")
import c_int, c_float
integer(kind=c_int), intent(in), value :: nrow_c, ncol_c
real(kind=c_float) , intent(in)        :: x(ncol_c,nrow_c)
real(kind=c_float) , intent(out)       :: col_sums(nrow_c)
end subroutine sum_cols
!
subroutine scale_mat(xscale, nrow_c, ncol_c, x) bind(c,name="scale_mat")
import c_int, c_float
real(kind=c_float) , intent(in), value :: xscale
integer(kind=c_int), intent(in), value :: nrow_c, ncol_c
real(kind=c_float) , intent(in)        :: x(ncol_c,nrow_c)
end subroutine scale_mat
!
end interface
print*,"matrix"
do ir=1,nr
   do ic=1,nc
      x(ir,ic) = real(10*ir + ic, kind = c_float)
   end do
   print "(*(1x,f5.1))",x(ir,:)
end do
print "(/,a)", "calling print_mat"
call print_mat(nrow_c=nc, ncol_c=nr, x=x)
print*
print*,"sum_mat(nr,nc,x),sum(x)=",sum_mat(nr,nc,x),sum(x)
call sum_cols(nrow_c=nc, ncol_c=nr, x=x, col_sums=col_sums)
print*,"col_sums=",col_sums
print*,"sum(x,dim=1) =",sum(x,dim=1)
call scale_mat(xscale=10.0, nrow_c=nc, ncol_c=nr, x=x)
print "(/,a)","matrix after scaling:"
call print_mat(nrow_c=nc, ncol_c=nr, x=x)
end program xxmat

Compiling with

gcc -c sum_mat.c
gfortran sum_mat.o xxmat.f90

gives output

matrix
  11.0  12.0  13.0
  21.0  22.0  23.0

calling print_mat
 11.000000 12.000000 13.000000
 21.000000 22.000000 23.000000

 sum_mat(nr,nc,x),sum(x)=   102.000000       102.000000    
 col_sums=   32.0000000       34.0000000       36.0000000    
 sum(x,dim=1) =   32.0000000       34.0000000       36.0000000    

matrix after scaling:
 110.000000 120.000000 130.000000
 210.000000 220.000000 230.000000

The only tricky part was that real :: x(n1,n2) corresponds to float x[n2][n1], which I often stumbled over. A C function to compute row sums corresponds to a Fortran function to compute column sums, as the interface for sum_cols illustrates, and the argument names in that interface remind me what is being done. By defining a wrapper with assumed shape argument in a module, such as

module mat_mod
use iso_c_binding, only: c_int, c_float
implicit none
interface
subroutine sum_cols_base(nrow_c,ncol_c,x,col_sums) bind(c,name="sum_rows")
import c_int, c_float
integer(kind=c_int), intent(in), value :: nrow_c, ncol_c
real(kind=c_float) , intent(in)        :: x(ncol_c,nrow_c)
real(kind=c_float) , intent(out)       :: col_sums(nrow_c)
end subroutine sum_cols_base
end interface
contains
!
function sum_cols(x) result(col_sums)
real(kind=c_float), intent(in) :: x(:,:)
real(kind=c_float)             :: col_sums(size(x,2))
call sum_cols_base(size(x,2),size(x,1),x,col_sums)
end function sum_cols
end module mat_mod

you can define a procedure that can be accessed without having to remember how Fortran and C matrix ordering differs, for example

program xxmat
use iso_c_binding, only: c_float
use mat_mod, only: sum_cols
implicit none
integer, parameter :: nr = 2, nc = 3
real(kind=c_float) :: x(nr,nc)
integer            :: ir,ic
print*,"matrix"
do ir=1,nr
   do ic=1,nc
      x(ir,ic) = real(10*ir + ic, kind = c_float)
   end do
   print "(*(1x,f5.1))",x(ir,:)
end do
print "(a,*(1x,f5.1))","column sums =",sum_cols(x)
end program xxmat
! output:
!  matrix
!   11.0  12.0  13.0
!   21.0  22.0  23.0
! column sums =  32.0  34.0  36.0

The C code shown cannot be compiled with g++. I see how to pass matrices from Fortran to C, but what about C++?

@Beliavsky, for C++ you have to place the C functions in an extern “C” block or statement.

The following works on most compilers I’ve tried

#ifdef __cplusplus
extern "C" {
#endif

put your C functions here

#ifdef __cplusplus
}
#endif

the __cplusplus macro is intrinsic to all the C++ compilers I’ve tried. This structure allow you to compile the same code for both C and C++. Also, if your C functions are wrappers around a C++ function in a particular namespace the namespace should contain the extern “C” block (not sure if this is allways true but I know just enough C++ to be “dangerous”).

Also, as mentioned previously, you might have to use C++ to link the final executable.

Here the situation with C is better considered separately from C++.

With C, the Fortran standard starting Fortran 2003 (about 18 years go) started providing a way to recognize the companion processor and defined the aspects toward interoperability. And practically, the conforming Fortran processors now strive to also interoperate with a companion C processor.

But with C++ and anything other language, they might as well not exist from a Fortran standard-conforming processor standpoint.

Thus when it comes to working with C++, it is back to pre-Fortran 2003 days of mixed-language programming, pretty much processor-dependent territory. It is better then for coders to stick to programs that are initiated with C++ processors (meaning no Fortran main program) and strive to interoperate such programs with the C processor per C++ semantics (extern “C” is part of this) and constrain Fortran to such a strict box toward C. On Windows OS in simple terms, this would mean Fortran code is in DLLs as much as feasible.

1 Like

I’ve identified a problem with the program in the OP, that demonstrates some of the issues mentioned by @FortranFan and @nicholaswogan.

First I edited the program to call normal_vec multiple times with the same seed:

program random_normal

use iso_c_binding
implicit none

interface
   subroutine normal_vec(seed, n, x) bind(c)
      import c_int, c_double
      integer(kind=c_int), intent(in), value :: seed, n
      real(kind=c_double)                    :: x(n)
   end subroutine normal_vec
end interface

integer :: seed
integer, parameter :: n = 5
real(kind=c_double) :: x(n)

seed = 42

call normal_vec(seed,n,x)
print "(*(1x,f7.4))",x

call normal_vec(seed,n,x)
print "(*(1x,f7.4))",x

call normal_vec(seed,n,x)
print "(*(1x,f7.4))",x

end program random_normal

To my surprise, the numbers would just repeat themselves:

$ ./a.out
 -1.7141  0.1781  0.0572 -1.4098  0.7563
 -1.7141  0.1781  0.0572 -1.4098  0.7563
 -1.7141  0.1781  0.0572 -1.4098  0.7563

What went wrong? At each call the random engine object is being initialized from the start of the random number sequence corresponding to seed. The fix is to add the static property to both the engine and distribution objects:

#include <random>
#include <span>  /* C++ 20 */

extern "C" void normal_vec(int seed, int n, double xarr[]);

void normal_vec(int seed, int n, double xarr[]) {
    
    // because engine and distributions retain state, they should be
    // defined as static so that new numbers are generated on each call
    static std::default_random_engine re{static_cast<size_t>(seed)};
    static std::normal_distribution<double> norm{0,1};

    std::span<double> _x{xarr,static_cast<size_t>(n)};

    for(double &x : _x) {
        x = norm(re);
    }
}

The rules for storage duration and linkage for static variables in block scope (including in the local scope of a function) are explained here: Storage class specifiers - cppreference.com

The engine and distribution objects are now initialized only the first time the function is called. I’m assuming that one object will be initialized for each new seed, but I was not able to confirm this. The destructors are called at program exit. From my understanding this will not happen if your main program is in Fortran, unless you call std::exit explicitly, meaning we’ve probably created a memory leak.

Once you add the static attribute, using gfortran as the linker will fail. Instead the C++ linker needs to be used, to handle the static storage creation, initialization, and destruction. To be even more secure, one would also need to make sure all the constructors and C++ functions being called are no-except, or make sure to try and catch every exception.

So indeed, inter-operating C++ and Fortran requires much more care that it may seem at first.

This leaves me wondering, whether using C++ from a Fortran co-array program is feasible at all, and if yes, under what restrictions (no exceptions, no classes, namespaces, or functions with static storage, …)?

1 Like

I would not call it a problem, although it may not have the functionality desired by some users. If you want to do multiple simulations with the same stream of random numbers, and you don’t want to store them, then it is convenient to have calls with the same seed produce the same set of variates. That is how the random_number of Fortran works:

program main
implicit none
integer :: i,nseeds,iter
integer, allocatable :: seeds(:)
real :: xran
call random_seed(size=nseeds)
print*,"nseeds =",nseeds
seeds = 1000*[(i,i=1,nseeds)]
do iter=1,3
   call random_seed(put=seeds)
   call random_number(xran)
   print*,"xran =",xran
end do
end program main

gfortran output on Windows:

 nseeds =           8
 xran =  0.476731658    
 xran =  0.476731658    
 xran =  0.476731658

ifort output on Windows:

 nseeds =           2
 xran =  0.9807355    
 xran =  0.9807355    
 xran =  0.9807355

Is it not an option to make a wrapper C++ class that contains RNG things and use the pointer to it (opaque handle) from the Fortran side? (If that pointer is deleted manually, the destructor also takes care of the RNG things…?)

Also, I’m wondering if it is possible to catch exception in a general manner on the C++ side, and if necessary, return related messages to the Fortran side?

The crucial difference is you have separate subroutines for seeding the RNG and retrieving variates.