First post, nice forum! I am in “over my head”: I am working on a data processor with a simple, linear workflow. Efficiency is key (terabytes), and the workflow should be easily configurable (swapping blocks or reconfiguring them). The workflow configuration itself should be done by some glue code. Each processing sub-element could be an OO Fortran procedure running parallel code and consuming data and classes, preferably instantiated from the glue code. Assuming the glue code is in Python (with which I am not familiar), how do I pass classes around? I have looked at gfort2py, and I like what I see, but it still lacks the ability to wrap and pass classes as opaque objects. What is next: SWIG? (ugh!) Thank you in advance!
Hi @thierry, welcome to the forum! Currently the most robust way that I know of is to use iso_c_binding and then, e.g., Cython to wrap it into Python. That means that you have to pass the Fortran OO objects via the C interface.
It’s a good idea to be able to automatically wrap any Fortran structures such as OO. I’ve added this to a TODO list for LFortran: https://gitlab.com/lfortran/lfortran/-/issues/133
This is a common problem in genomics processing. I think you are overthinking the problem. Run the code on many processors, have each image read in part of the data, swap blocks and reconfigure among the images as needed, and then let each image process its part independently. No OO, classes, “glue code”, or Python. Just simple, straightforward Fortran.
Welcome to the forum.
Given what you write in the original post, it may help to keep a few things in mind:
Please note that development using modern Fortran in conjunction with other languages (e.g., Python or C++ providing glue code) is still in its early stages, in the sense that the processors (compilers) have only recently been coming on board with support for the C interoperability features, especially the enhanced facilities in the Fortran 2018 standard.
Tools that autogenerate interface and glue code between Fortran and other drivers are few and far between, and those that exist need to be tested further, documented better, and scripted more finely for confident usage.
Under the circumstances, if you are looking for something reliable which you and your teams can understand and consume productively, you may want to be ready to do some “heavy-lifting” yourself by writing your own wrappers. Also be ready for some verbosity and perhaps some duplication of effort. Note that instructions need to be spelt out explicitly on the Fortran side; unlike C++, there simply isn't a terse, Morse-code-like syntax to achieve a lot of tasks with a few symbols!
Additionally, you may want to be open to the idea of making the Fortran procedures for your work tasks (data processor?) primarily PURE/ELEMENTAL SUBROUTINE and/or FUNCTION subprograms which are also interoperable with a companion C processor, i.e., via the BIND(C, …) clause, as mentioned by @certik upthread.
Then, if the callers are primarily Python, C++, etc., you may want to focus on OO design in your glue code instead of pursuing OO in Fortran. You are likely to find this a more efficient way to use the Fortran procedures.
Where the above-mentioned duplication may come in is if there are also Fortran callers for the procedures and you wish to provide similar-looking OO-based APIs for such callers. In that case, a duplicate but “thin” Fortran wrapper using a derived type with type-bound procedures, analogous to your glue code in Python, etc., will make sense.
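As a rough sketch of what "OO in the glue code, flat procedures underneath" could look like for a linear, reconfigurable workflow: the two kernels below are pure-Python stand-ins for side-effect-free Fortran procedures exposed through BIND(C), and every name (`remove_offset`, `apply_gain`, `Stage`, `run_pipeline`) is hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# Stand-ins for flat, side-effect-free compute kernels. In a real setup
# these would be thin wrappers around BIND(C) Fortran procedures.
def remove_offset(samples, offset):
    return [s - offset for s in samples]


def apply_gain(samples, gain):
    return [s * gain for s in samples]


@dataclass
class Stage:
    """One block of the workflow: a kernel plus its configuration."""
    name: str
    kernel: Callable
    params: dict = field(default_factory=dict)

    def run(self, samples):
        return self.kernel(samples, **self.params)


def run_pipeline(stages: List[Stage], samples):
    """Feed the output of each stage into the next, in list order."""
    for stage in stages:
        samples = stage.run(samples)
    return samples


# The workflow is plain data: swapping or reconfiguring blocks means
# editing this list, not the kernels.
pipeline = [
    Stage("offset", remove_offset, {"offset": 1.0}),
    Stage("gain", apply_gain, {"gain": 2.0}),
]
```

With this layout, `run_pipeline(pipeline, data)` and `run_pipeline(list(reversed(pipeline)), data)` exercise the same kernels in a different order, which is the kind of block swapping the original question asks for.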
If you can eventually share the details of your program(s) online, it will likely be a great test case for upcoming tooling facilities in LFortran, etc.
Thank you for all the advice. There are many valuable insights, not all of which I fully understand, I am sure.

Over-designing may be counterproductive, as @billlong suggested. As a consequence, one of the options I am considering is to have each block be a separate Fortran main program which reads and writes all data plus intermediate processing parameters (by serialization) to disk/scratch. The data flow then consists of scripts and/or makefiles. In the previous (still in operation) processor, which has become nearly unmaintainable, everything is passed through shared memory with very little encapsulation, and tracing dependencies is very difficult since nearly everything occupies a single namespace. With a minimum data chunk size that can be anywhere between 10 MB and 10 GB, writing repeatedly at each block/executable is a lot of IO, but not a show stopper.

@certik In the interop case, using iso_c_binding (then Cython for Python) is the most straightforward way, in spite of its limitations, and that was the direction of the first refactor cycle. I am now trying to get away from it, since maintaining separate versions of the object instantiation/flattening on each side of the language barrier is duplicate code (@FortranFan) and dangerous. It has also become clear that Python is unavoidable for most developers. For all these reasons, implementing the workflow configuration in Python and passing Fortran objects as opaque handles seems a really attractive solution, and it maintains a shared-memory design: lfortran, fffi, …?

@FortranFan Implementing high-level interfaces for the blocks in the glue code and doing the heavy lifting through Fortran and iso_c_binding is the current redesign goal, with its code duplication problems.
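For comparison, the one-executable-per-block option described earlier in this post could be driven along the following lines. This is only a sketch under loud assumptions: the two stage functions stand in for separate Fortran executables (which a real driver would launch with something like `subprocess.run`), JSON stands in for whatever binary serialization the blocks would actually use, and all names and the record layout are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def stage_detrend(src: Path, dst: Path):
    """Stand-in for one executable: read a record, remove the mean, write it."""
    record = json.loads(src.read_text())
    data = record["data"]
    mean = sum(data) / len(data)
    record["data"] = [x - mean for x in data]
    # Intermediate processing parameters travel with the data.
    record["params"]["removed_mean"] = mean
    dst.write_text(json.dumps(record))


def stage_scale(src: Path, dst: Path):
    """Stand-in for a second executable: scale by a configured factor."""
    record = json.loads(src.read_text())
    factor = record["params"]["scale"]
    record["data"] = [x * factor for x in record["data"]]
    dst.write_text(json.dumps(record))


def run_workflow(stages, record, scratch: Path):
    """Run the stages in order, handing files (not objects) between them."""
    current = scratch / "stage_0.json"
    current.write_text(json.dumps(record))
    for i, stage in enumerate(stages, start=1):
        nxt = scratch / f"stage_{i}.json"
        stage(current, nxt)
        current = nxt
    return json.loads(current.read_text())


def demo():
    """Run a two-stage workflow in a temporary scratch directory."""
    with tempfile.TemporaryDirectory() as scratch:
        record = {"data": [1.0, 2.0, 3.0], "params": {"scale": 2.0}}
        return run_workflow([stage_detrend, stage_scale], record, Path(scratch))
```

The stages share nothing but the file format, so dependencies are easy to trace, and every intermediate file left in scratch can be inspected or used as a restart point. The price is the repeated IO already mentioned above.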
There are other possibilities which might be worth considering, such as embedding python (lua, aotus) for configuration management as can be seen in some fortran projects on github … Unfortunately we are not open source yet … related project: ISCE2.