Using coarrays in a shared library (OpenCoarrays)

hsnyder · April 22, 2021, 4:41am

Suppose I wanted to make a shared library exposing a C interface via bind(C), and use that library from a language other than Fortran, but use OpenCoarrays within the library. Does anybody know how I could accomplish this? I gather that I have to call _gfortran_caf_init() and the associated finalization routine, but would I have to execute the program that uses that library by using cafrun / mpiexec? I’d love to hear from anyone who has experience with or knowledge about this. Thanks.

shahmoradi · April 22, 2021, 5:41am

I have tried to achieve this with both Intel Coarray Fortran and GNU/OpenCoarrays. Neither attempt was successful. I have asked the Intel team to consider adding this feature but received no positive response. The current compiler implementations of coarrays require the main program to be in Fortran as well. I would be exhilarated to hear from someone in this forum proving me wrong about this. There are also Cray and NAG Coarray compilers, but I have no experience with them.
Perhaps with more demand, Intel/OpenCoarrays would consider adding such functionality to their Coarray implementations. It is absolutely essential. Just because of this missing feature, we had to reimplement a full project from scratch via MPI (instead of the original Coarray implementation).

themos · April 22, 2021, 11:59am

I would write a main program in Fortran that then calls a c_main C routine. You can pass command line arguments in standard-conforming ways. On the C side, just rename your main() to c_main() and recompile. At some point in the C code, you call a Fortran subroutine that uses coarrays. Compile and link with the Fortran compiler, deploy normally as for a Fortran coarray program. I tried an example with NAG Fortran Compiler.
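The C side of this arrangement might look like the sketch below. It assumes the Fortran main program (not shown) collects the command-line arguments and calls c_main through a bind(C) interface. The name run_coarray_kernel and both signatures are hypothetical. The C definition of run_coarray_kernel is only a placeholder so the example compiles on its own; in a real build that symbol would come from the Fortran object file.

```c
#include <stdio.h>

/* Hypothetical stand-in for a Fortran subroutine declared with
   bind(C, name="run_coarray_kernel"). Remove this definition when
   linking against the real Fortran code. */
int run_coarray_kernel(int n)
{
    return 2 * n; /* placeholder work */
}

/* Formerly main(). The Fortran main program owns startup, so the
   coarray runtime is already initialised when this function runs. */
int c_main(int argc, char **argv)
{
    for (int i = 1; i < argc; ++i)
        printf("argument %d: %s\n", i, argv[i]);

    /* Hand off to the (hypothetical) coarray routine. */
    int result = run_coarray_kernel(argc - 1);
    printf("kernel returned %d\n", result);
    return (result >= 0) ? 0 : 1;
}
```

Everything would then be linked with the Fortran compiler and launched the same way as any other coarray program.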

Arjen · April 22, 2021, 12:02pm

That would be my solution as well - it is independent of whatever compiler you’re using (no specific code needed) and it is a very simple solution too.

shahmoradi · April 22, 2021, 2:17pm

@themos @Arjen That seems like a viable solution for personal use. But it is likely too complex for a C user of the Fortran library who may not know any Fortran. It would also always require the user to have a Fortran compiler installed on the system. In the case of Python or similar languages, I imagine it would become even more complex.

themos · April 22, 2021, 2:28pm

It would require the user having the necessary Fortran runtimes and a linking script (the main Fortran program can be supplied in compiled form and the user does not need to be aware of it any more than they need to be aware of exactly what Fortran runtime file does what). I don’t see that this is more onerous than installing and deploying with MPI, what am I missing?

Arjen · April 22, 2021, 2:33pm

Indeed, using an MPI enabled program requires selection of various hosts as well as the proper MPI environment as that depends on the MPI version/distribution that was used to build the program. Compare that to the requirement to put your program in a subroutine with a specific name instead of a program (or a main function in C).

hsnyder · April 22, 2021, 2:37pm

Thanks everyone. In my case that’s not a viable strategy because the shared library will be opened dynamically via dlopen(), possibly closed and reloaded later, and is part of optional functionality of the overall program. But in other cases I can see it being a good strategy and I appreciate the suggestion.

I’m probably going to go with a two process + IPC solution (the actual shared library would contain function stubs that actually make remote procedure calls to another process). Complicated, but in my particular case I think it’s a better solution, and I’ll mention it in case it gives the idea to anyone else.
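One possible shape for such a stub is sketched below. It is not hsnyder's implementation. The function remote_add and the trivial worker are hypothetical, and the worker is forked on every call only to keep the example short. In a real setup the worker would be a long-lived, separately launched coarray program.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read exactly len bytes; returns 0 on success, -1 on error or EOF. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical worker: stands in for the separate parallel process. */
static void worker(int fd)
{
    double x[2];
    if (read_all(fd, x, sizeof x) == 0) {
        double sum = x[0] + x[1];
        write_all(fd, &sum, sizeof sum);
    }
    _exit(0);
}

/* Stub exported by the shared library: looks like a local function,
   but the work happens in another process. Returns 0 on success. */
int remote_add(double a, double b, double *out)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(sv[0]); close(sv[1]); return -1; }
    if (pid == 0) { close(sv[0]); worker(sv[1]); } /* child never returns */

    close(sv[1]);
    double request[2] = { a, b };
    int rc = -1;
    if (write_all(sv[0], request, sizeof request) == 0 &&
        read_all(sv[0], out, sizeof *out) == 0)
        rc = 0;
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return rc;
}
```

Because the library itself contains only the stub, it can be loaded and unloaded with dlopen() without touching the parallel runtime.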

septc · April 22, 2021, 3:13pm

Just out of curiosity, what kind of IPC are you planning to use for your application? I’ve recently gotten to know some of the available options (e.g. those mentioned on the following page) and am trying to learn more (though my progress is very slow…)
https://cyber.dabamos.de/programming/modernfortran/inter-process-communication.html

hsnyder · April 22, 2021, 3:15pm

I’ll post a complete example once I get it working (if…).

septc · April 22, 2021, 3:16pm

Oh thanks very much

certik · April 22, 2021, 8:34pm

I did not realize this is not possible with co-arrays (I haven’t used them seriously yet, but I want to).

@rouson you might be interested in this use case. That is one reason one would use MPI over co-arrays.

The good news is that I think it should be possible to fix, as this does not seem to be an intrinsic issue with the Fortran language itself, just the compilers.

everythingfunctional · April 22, 2021, 10:06pm

I think there actually might be an issue in the standard. (IIRC) the standard says that images are launched at program startup, before the start of execution. If the main program isn’t in Fortran, it may not even be compiled by a Fortran compiler, and so the code to launch the images won’t be added.

I think the only way you could conceivably fix this would be to say that, on entry to any translation unit (i.e. Fortran source file) that uses any parallel features, the code must check whether images have been launched, launch them if not, and on exit shut them back down if it was the one that launched them. That sounds like a non-starter.

Alternatively, you might be able to say that code that calls Fortran must launch/execute the equivalent of images, such that the behavior has the same semantics. I’d be willing to bet that a C program executed with mpiexec would actually do this, but it’s worth testing, and then formally documenting in some way (probably in any appropriate standards).
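The check-on-entry idea from this post can be modelled with a toy guard, shown below. Every name here is hypothetical and no real runtime is involved. The only point is to show that each outermost call into the library would start and stop the images again, which is part of why the idea looks impractical.

```c
#include <stdbool.h>

static bool images_running = false; /* hypothetical runtime state */
static int launches = 0;            /* how many times images were started */

/* On entry: start images if needed and report whether this call did so. */
bool parallel_enter(void)
{
    if (images_running) return false;
    images_running = true; /* a real runtime would launch images here */
    ++launches;
    return true;
}

/* On exit: shut down only if the matching enter performed the launch. */
void parallel_exit(bool launched_here)
{
    if (launched_here) images_running = false; /* real runtime: stop images */
}

/* Two library routines; the outer one calls the inner one. */
void inner_routine(void)
{
    bool mine = parallel_enter();
    /* ... parallel work would go here ... */
    parallel_exit(mine);
}

void outer_routine(void)
{
    bool mine = parallel_enter();
    inner_routine(); /* nested call reuses the running images */
    parallel_exit(mine);
}

int launch_count(void) { return launches; }
```

Nested calls share one launch, but two consecutive top-level calls each pay the full startup and shutdown cost.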

FortranFan · April 23, 2021, 3:06am

Conceivably the approach taken with enhanced interoperability with C can be extended such that ISO_Fortran_binding.h defines additional functions (and types, constants, etc.) as needed, so that a program without a Fortran main can work with the companion C processor to set up and launch the images and the Fortran procedures can then make use of coarrays. Basically the “work” a Fortran main does at startup can be abstracted, modeled, and standardized as functions to be executed using the C companion processor.

rouson · April 23, 2021, 5:11am

@certik thanks for tagging me. @hsnyder reached out to me separately and we corresponded some. I don’t think this is likely to be a fruitful path for several fundamental reasons that I’ve outlined at length in my correspondence with @hsnyder.

At program launch, there’s more to do than to just call caf_init. I’m pretty sure each coarray has to be registered with OpenCoarrays via caf_register and any non-allocatable coarrays have to be allocated. Moreover, how will segment-ordering be enforced? If the calling program is written in a compiled language, then compiler optimizations will likely need to be disabled to prevent potential code movement across segment boundaries during optimization.

If the desire is to use the coarray parallel programming approach, it’s likely to be less trouble and less error prone to introduce coarray abstractions to the desired language. There are published examples of Coarray Python and several published Coarray C++ implementations, including for example, a Coarray C++ library that ships with the Cray C++ compiler.

rouson · April 23, 2021, 5:17am

Alternatively, if all of the performance-critical computation is being handed off to Fortran, then another approach would be for each desired function to be compiled into a self-contained executable file, complete with its own main program. Then, if you’re wanting to call it like a function, all the data that you would have turned into intent(in) arguments can instead become either command-line arguments or data read from standard input or data read from an input file (which the calling language would write before launching the Fortran code). Similarly, for the data that would otherwise be an intent(out) argument, the Fortran code instead prints it to standard output or to an output file (which the calling language would read after running the Fortran code).

For the program I/O, one might then adopt a standard file format such as JSON. On the Fortran side, the input/output could be read/written using a library such as @everythingfunctional’s jsonff.
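From the calling side, the executable-as-function approach could look roughly like this sketch, which uses POSIX popen() to run a command and read a single number from its standard output. The helper name call_program is hypothetical. A real deployment would pass the launcher command (for example one starting with cafrun) and would probably parse a richer format such as JSON.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Run `command` through the shell and parse one double from its
   standard output. Returns 0 on success, -1 on any failure. */
int call_program(const char *command, double *result)
{
    FILE *pipe = popen(command, "r");
    if (pipe == NULL) return -1;

    int parsed = fscanf(pipe, "%lf", result);

    /* Drain remaining output so the child is not killed by a closed pipe. */
    while (fgetc(pipe) != EOF) { }

    int status = pclose(pipe);
    return (parsed == 1 && status == 0) ? 0 : -1;
}
```

Inputs that would have been intent(in) arguments can be appended to the command string or written to a file before the call.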

themos · April 23, 2021, 11:20am

On POSIX systems, you can use POSIX shared memory (see the shm_overview(7) Linux manual page) to share data between processes, which should be more efficient than using the filesystem or pipes.
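A minimal sketch of that mechanism follows, using shm_open(), ftruncate() and mmap(). The function name and the object name "/caf_demo_<pid>" are made up for the example. The child process stands in for the separately launched Fortran program and opens the object by name, as an unrelated process would.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a shared-memory object, let a child write `value` into it,
   and read the value back in the parent. Returns 0 on success. */
int share_value_with_child(double value, double *seen_by_parent)
{
    char name[64];
    snprintf(name, sizeof name, "/caf_demo_%ld", (long)getpid());

    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, sizeof(double)) != 0) {
        close(fd);
        shm_unlink(name);
        return -1;
    }
    double *shared = mmap(NULL, sizeof(double), PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after close */
    if (shared == MAP_FAILED) { shm_unlink(name); return -1; }

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: attach by name, as a separate program would. */
        int cfd = shm_open(name, O_RDWR, 0);
        if (cfd < 0) _exit(1);
        double *p = mmap(NULL, sizeof(double), PROT_READ | PROT_WRITE,
                         MAP_SHARED, cfd, 0);
        if (p == MAP_FAILED) _exit(1);
        *p = value;
        _exit(0);
    }

    int rc = -1, status = 0;
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        *seen_by_parent = *shared;
        rc = 0;
    }
    munmap(shared, sizeof(double));
    shm_unlink(name);
    return rc;
}
```

On systems with an older glibc this needs to be linked with -lrt. Real use would also need some synchronization (for example a semaphore) instead of waiting for the child to exit.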

shahmoradi · April 23, 2021, 4:31pm

@certik @rouson Thanks for your input. The above sentence by @rouson is indeed what happens in my application: (almost) all performance-critical functions are on the Fortran side. There is, however, frequently the need to call back into the other language from Fortran to evaluate a callback function. In such cases, it seems impossible or at best not straightforward at all (from the user’s perspective) to implement the approach suggested in your response. Currently, I achieve the desired behavior via MPI:

1. The user launches the application (whether Python, C, …) via mpiexec with any desired number of processes.
2. The non-Fortran glue code in each process loads the Fortran shared library.
3. The Fortran shared library checks the MPI initialization and sets up the environment and communications.
4. The Fortran shared library calls back the C/Python callback function passed to it and performs the assigned tasks.
5. The Fortran shared library finalizes MPI (or does not, depending on the user’s preference) and returns control to the C/Python main application.

With the above approach, the non-Fortran user does not even need to know anything about Fortran or MPI at all, other than having a compatible installation of MPI runtime libraries on their system (which can be also automated).
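The callback step in the workflow above relies on passing a C function pointer into the library. The sketch below shows only that mechanism and leaves MPI out entirely. The routine evaluate_on_grid is a hypothetical C stand-in for what would be a Fortran procedure declared with bind(C) that receives the callback, and none of the names come from the actual library.

```c
#include <stddef.h>

/* Signature the library expects for the user-supplied callback. */
typedef double (*objective_fn)(const double *point, size_t ndim);

/* Hypothetical stand-in for the library entry point. It calls the
   user's function at each point and returns the largest value seen.
   Requires npoints >= 1; points holds npoints * ndim values. */
double evaluate_on_grid(objective_fn f, const double *points,
                        size_t npoints, size_t ndim)
{
    double best = f(points, ndim);
    for (size_t i = 1; i < npoints; ++i) {
        double v = f(points + i * ndim, ndim);
        if (v > best) best = v;
    }
    return best;
}

/* Example user callback: negative squared distance from the origin. */
double neg_sum_sq(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) s += x[i] * x[i];
    return -s;
}
```

The user only writes the callback and calls the entry point. Whatever parallel setup the library performs stays hidden behind that one call.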

It would be ideal if such an approach could be adopted with Coarray, but currently, that is impossible. This severely limits the applications of Coarray to only pedagogical activities or application developments that are purely in Fortran.

FortranFan · April 23, 2021, 4:53pm

There is an increasing number of domains where a Fortran developer can know nothing about the main program other than the fact it will not be a Fortran main!

As others have been intimating (and as @rouson mentioned in a comp.lang.fortran thread years ago), a Fortran developer mostly resides in the library space (shared libraries on *UX, DLLs on Windows), which brings certain specific requirements for Fortran language evolution, including in parallel programming, generic metaprogramming, object-oriented and functional programming, exception handling, other conveniences, etc.

The ground reality of how Fortran codes may get consumed in almost any environment (i.e., outside of pedagogy or limited climate modeling and certain large institutional research) is such that certain use cases need to be studied closely and included prominently in language development. Unfortunately, it’s so goddamn hard to influence, in a timely manner, those who can vote on the importance of library development using Fortran and on the gaps that come about with the current standard revision.

The example given by @shahmoradi with coarrays is a perfect illustration. It could now take another 20+ years to address the limitation in the base language itself, and that should be an unacceptable opportunity cost: too much scientific and technical computing code won’t be authored in Fortran on account of this.

milancurcic · April 23, 2021, 5:27pm

To be fair, only the main program and the coarray logic must be written in Fortran. Other parts of the application can be in other languages.
