That’s a substantial limitation though, at least in my experience.
If there really is a problem with the standard, then remote-procedure-call solutions may be the best way to go in the meantime. I’m going to put some thought into what tooling could be produced to make dealing with that less painful.
It seems to me that since you mention dlopen(3), you can also intercept normal program startup and load and run whatever code a Fortran main program would cause to be run to initialize the environment. But that needs inside information from your compiler vendor. This could still be a quality-of-implementation issue rather than a shortcoming of the Standard.
Thanks @rouson for the feedback. As others alluded to, I believe Fortran simply must interoperate well with other languages, one way or another. As @themos suggested, I also believe a compiler of your choice simply has to export enough functionality in (let’s say) C++ that you call at initialization, and it takes care of registering co-arrays and whatever else is needed for successful execution.
@rouson also mentioned a complication in regard to segment ordering… I haven’t read the standard so I’m not 100% sure what’s being referred to here - is the issue that an optimizing compiler might move statements across a sync invocation? If that’s correct, then an API that completely enclosed all parallel behaviour would be able to fix that… e.g. you expose an API that does something in parallel in the background and then returns, but the calling program can’t interact with the images in any other way…
Overall, I still think the best approach for now would be to use a separate program, and some sort of IPC solution. I’ve started prototyping something like this:
The Fortran “library” is a standalone program with its own main function. The main function itself is an infinite loop that does the following (on image 1 only):
Reads a single byte from stdin
Looks that value up in a table that determines what procedure to call
Calls the indicated procedure
Writes a byte to stdout to indicate that execution is complete.
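To make the loop above concrete, here is a minimal sketch of the one-byte dispatch protocol. The real server would be the Fortran main program; this version is written in C only to illustrate the control flow. The names `serve`, `proc_fn`, `proc_ping` and the status values 0 and 1 are all hypothetical, not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical procedure type. It takes no arguments because, in the
   scheme described above, arguments travel through shared memory. */
typedef void (*proc_fn)(void);

/* Hypothetical example procedure: just counts how often it was called. */
int ping_count = 0;
void proc_ping(void) { ping_count++; }

/* Reads one opcode byte at a time from `in`, looks it up in `table`,
   calls the matching procedure, and writes a one-byte status to `out`.
   Stops at end of input and returns the number of requests handled. */
int serve(FILE *in, FILE *out, const proc_fn *table, int table_len)
{
    assert(table != NULL && table_len > 0 && table_len <= 256);
    int handled = 0;
    int op;
    while ((op = fgetc(in)) != EOF) {
        unsigned char status;
        if (op < table_len && table[op] != NULL) {
            table[op]();
            status = 0;   /* assumed convention: 0 = completed */
        } else {
            status = 1;   /* assumed convention: 1 = unknown opcode */
        }
        fputc(status, out);
        fflush(out);      /* the caller is blocked waiting for this byte */
        handled++;
    }
    return handled;
}
```

A server built this way would call something like `serve(stdin, stdout, table, n)` from its main function. The explicit flush matters because stdout is normally fully buffered when it is connected to a pipe.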
All actual DATA communication would be done via shared memory between the calling program and the Fortran program. On POSIX, both programs would open named shared memory segments to pass arguments and return values. If we assume that only image 1 is the recipient of the shared memory, and then use co_broadcast to get the values out to other images, we could even support off-node parallelization this way…
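As a rough sketch of the shared memory side, the C snippet below maps a named POSIX segment with `shm_open`, `ftruncate` and `mmap`. The helper name `shm_region_map` and the idea that one side creates the segment while the other only attaches are assumptions made for illustration. On some older Linux systems `shm_open` requires linking with `-lrt`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps a named POSIX shared memory segment of
   `size` bytes for reading and writing. If `create` is nonzero the
   segment is created (if needed) and sized first.
   Returns NULL on failure. */
void *shm_region_map(const char *name, size_t size, int create)
{
    assert(name != NULL && size > 0);
    int flags = create ? (O_CREAT | O_RDWR) : O_RDWR;
    int fd = shm_open(name, flags, 0600);
    if (fd < 0)
        return NULL;
    if (create && ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

The calling program might create the segment and fill in the arguments, while the Fortran program attaches to the same name (for example through `bind(C)` interfaces to these POSIX calls) and reads them on image 1.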
The best way to expose this functionality to a program that wants to consume it would be to have a “wrapper” library, probably written in C since basically everything can call C. That wrapper library would have an init/deinit function, plus the actual library routines of interest, exposed as normal functions. The library init function would use something like popen in order to cafrun/mpiexec the Fortran program and hold pipes to the process’s stdin and stdout. Then, the individual function wrappers would do whatever they need to do in order to get data to and from the shared memory segments and write the appropriate bytes to the stdin/stdout pipes.
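One detail worth noting: POSIX specifies `popen` as one-directional, so a wrapper that needs both the child’s stdin and stdout would probably use `pipe`, `fork` and `exec` instead. The sketch below shows what the init, call and deinit functions might look like under that assumption. The names `struct server`, `server_start`, `server_call` and `server_stop` are hypothetical, and the launcher command line (cafrun, mpiexec, and so on) is left entirely to the caller.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct server {
    pid_t pid;
    int to_child;    /* write end, connected to the child's stdin  */
    int from_child;  /* read end, connected to the child's stdout  */
};

/* Starts argv[0] with its stdin/stdout connected to two pipes.
   Returns 0 on success, -1 on failure. */
int server_start(struct server *s, char *const argv[])
{
    assert(s != NULL && argv != NULL && argv[0] != NULL);
    int in_pipe[2], out_pipe[2];
    if (pipe(in_pipe) != 0)
        return -1;
    if (pipe(out_pipe) != 0) {
        close(in_pipe[0]); close(in_pipe[1]);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(in_pipe[0]); close(in_pipe[1]);
        close(out_pipe[0]); close(out_pipe[1]);
        return -1;
    }
    if (pid == 0) {                      /* child */
        dup2(in_pipe[0], STDIN_FILENO);
        dup2(out_pipe[1], STDOUT_FILENO);
        close(in_pipe[0]); close(in_pipe[1]);
        close(out_pipe[0]); close(out_pipe[1]);
        execvp(argv[0], argv);
        _exit(127);                      /* exec failed */
    }
    close(in_pipe[0]);                   /* parent keeps the other ends */
    close(out_pipe[1]);
    s->pid = pid;
    s->to_child = in_pipe[1];
    s->from_child = out_pipe[0];
    return 0;
}

/* Sends one opcode byte and blocks until the one-byte reply arrives.
   Returns the reply (0..255) or -1 on I/O failure. */
int server_call(const struct server *s, unsigned char op)
{
    unsigned char reply;
    if (write(s->to_child, &op, 1) != 1)
        return -1;
    if (read(s->from_child, &reply, 1) != 1)
        return -1;
    return reply;
}

/* Closes the pipes (the child sees EOF on stdin) and waits for it. */
int server_stop(struct server *s)
{
    int status = 0;
    close(s->to_child);
    close(s->from_child);
    if (waitpid(s->pid, &status, 0) < 0)
        return -1;
    return status;
}
```

Each generated wrapper function would then copy its arguments into shared memory, call `server_call` with the opcode for the target procedure, and copy the results back out. This sketch does nothing about a crashed child (a write to a dead process raises SIGPIPE by default), which is one of the details that would still need working out.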
This sounds like an abomination of complexity, but:
To actually use the Fortran library in a client application, a programmer just needs to make normal C function calls, and,
The wrapper functions could be automatically generated by a suitable tool, making it pretty easy on the Fortran programmer too.
Perhaps LFortran would be a good place to implement that sort of automatic client wrapper generation - I think the ASR has all the necessary information to do so.
Of course this is just a high level overview - there would be significant details to work out re: error reporting, what if the remote program crashes, etc… I’ll post some code in a couple of days that will hopefully clarify this if the above is confusing.
Not being able to use coarrays in a generic library that can be callable from other languages (namely Python) is a HUGE limitation. It effectively makes coarrays useless for a good chunk of modern use cases that involve Fortran at this point. I wish there was a way to use them that “just works” somehow.
FWIW, Arkouda (a distributed numpy-like library based on Chapel) seems to be using some communication method between Jupyter notebooks and HPC servers (e.g. on page 3 of these slides).
But if coarrays are built on top of MPI (e.g. in the case of gfortran), it might be possible to directly modify mpi4py (for example) to enable coarrays also somehow…?
Btw, I’ve also come across a somewhat old question on a similar topic:
Note that “program startup” happens in start.c (start.o) before the main program is called. That is where the images are set up, based on information from the PMI library that, in turn, accesses the program launching mechanism (srun, for example). The number of images is specified on the srun command. (srun is the launcher for SLURM. Other launch mechanisms have similar commands.)
That is a long story, with many discussions here and in other forums. The bottom line is that there is nothing wrong with using coarrays in shared libraries from the standard’s perspective. But compilers and OpenCoarrays do not have this implemented (yet), and based on the discussions, there does not seem to be any interest in making it possible, at least for now. I think the NAG compiler’s coarray implementation uses shared memory and could perhaps even be OpenMP underneath, but that is just a guess. There are NAG developers in this forum and I hope they can shed more light on the status and features of the NAG coarray implementation. Cray also has Coarray Fortran implemented, but I have never used it. The advantage of coarrays is a much simpler, more concise, and more elegant syntax than any other parallelism paradigm. But at the moment, coarrays tie you to Fortran: there is little scope for interoperating with other languages unless all of the parallel communication occurs within Fortran and the main program is also Fortran.
I will note that I am very much interested in making this possible. Right now I am focusing on compiling non-coarray code, but with @rouson, @hsnyder and others we have started figuring out how to get coarrays working and I am sure there is a way to get them working in shared libraries also.
This may be a silly question. Fortran seems to lack people working on open-source compilers. Why start over with LFortran? GFortran seems pretty good and is widely used, why not implement new ideas on GFortran?
You can. I encourage you to do that. Besides GFortran there is also Flang that you can consider contributing to.
I decided to start over for a few reasons:
Interactive usage – need to relax the parser and semantics to handle it
Performance: in order to compile as quickly as possible, one has to design the internal data structures very carefully.
Quicker to compile: LFortran compiles in about 15s to 30s on all my laptops from scratch. Very important from a development point of view.
Design: I personally really like LFortran’s design, we have spent a lot of time on it and I am very happy with it.
These are the main reasons, but it’s too early to do such comparisons until LFortran (and Flang) can compile as many codes as GFortran can. However, by the time we can compile all the codes, it is quite late to drastically change the internal compiler design.
Having more than one open source compiler helps ensure that Fortran codes are truly multiplatform and compiler independent.
I encourage you and others to contribute to all the open source compilers that you like, GFortran being the most mature of them all.
The project hasn’t been announced yet, so I won’t get into the details, but yes we are working on a project that will make it possible to use coarrays in shared libraries, at least on a single node. Hopefully in the distributed (multi-node) case as well, but I cannot confirm that at this stage.
@certik Thanks for your reply. Can you send me an example or more information? I am trying to understand how a Coarray Fortran DLL can be called from C#, but I always get an error.