Types from Fortran to Python via Opaque Pointers

Thought this might be of interest to some people here.

There’s nothing on the Fortran / C side which hasn’t been discussed here already, but the PyCapsule approach is new (AFAIK).
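For readers who haven't met it: a PyCapsule is CPython's standard container for handing an opaque C pointer around as a regular Python object, which is what makes it a natural vehicle for a pointer to a Fortran derived type. Below is a minimal sketch of the wrap/unwrap round trip, done purely through `ctypes.pythonapi` for illustration — the capsule name `"demo.handle"` and the buffer standing in for a Fortran object are made up, and this is not the code from the post:

```python
import ctypes

# PyCapsule_New / PyCapsule_GetPointer are part of the CPython C API;
# ctypes.pythonapi lets us call them directly for demonstration purposes.
ctypes.pythonapi.PyCapsule_New.restype = ctypes.py_object
ctypes.pythonapi.PyCapsule_New.argtypes = [
    ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [
    ctypes.py_object, ctypes.c_char_p]

# Pretend this buffer is an opaque handle returned by a Fortran library.
buf = ctypes.create_string_buffer(b"derived-type payload")
addr = ctypes.cast(buf, ctypes.c_void_p)

# Wrap the raw address in a named capsule...
capsule = ctypes.pythonapi.PyCapsule_New(addr, b"demo.handle", None)

# ...and recover it later; the name must match exactly.
recovered = ctypes.pythonapi.PyCapsule_GetPointer(capsule, b"demo.handle")
assert recovered == addr.value
```

In real use the capsule would be created on the C side of the wrapper, and the destructor argument (`None` above) would free the Fortran object when Python garbage-collects the capsule.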

10 Likes

All the C code is gibberish to me, but the basic idea of using pointers to access complex Fortran types from Python is the same idea as I showed in my 2018 paper: Application of Modern Fortran to Spacecraft Trajectory Design and Optimization. See Fig. 20. It’s a good technique, and you don’t need f2py (or swig or anything extra) if you just write the wrappers yourself.
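The pattern is easy to demonstrate even without any Fortran: C's `FILE*` is exactly such an opaque pointer, which Python stores and hands back without ever looking inside, just as it would with a handle to a Fortran derived type. A sketch (assumes a Unix-like system where `ctypes.CDLL(None)` exposes libc's symbols):

```python
import ctypes
import os
import tempfile

# On Linux/macOS, CDLL(None) gives access to the symbols of the
# running process, which include libc's fopen/fclose.
libc = ctypes.CDLL(None)
libc.fopen.restype = ctypes.c_void_p            # FILE* kept fully opaque
libc.fopen.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
libc.fclose.argtypes = [ctypes.c_void_p]

path = os.path.join(tempfile.gettempdir(), "opaque_demo.txt")

# Python never dereferences the handle; it only stores it and passes
# it back -- the same discipline as with a c_ptr to a derived type.
handle = libc.fopen(path.encode(), b"w")
assert handle is not None                        # non-NULL opaque handle
status = libc.fclose(handle)
assert status == 0
os.remove(path)
```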

1 Like

Thanks for the reference, I’ll keep it in mind for my next post :slight_smile:. Generally, the idea of using pointers for interfaces is very old, going back to SWIG and f2py as well, since (e.g.) f2py generates wrappers to interact with modules and functions (i.e. it “flattens” them to subroutines).

Personally though, I never have time to write or maintain hand-crafted wrappers (across operating systems), so f2py support is crucial for my libraries (like this Gauss-Jacobi Quadrature code), and I’d guess this is the general attitude of most people at the intersection of Python, C++ and Fortran. So the challenges are to:

  • Generate bindings with as little user interaction as possible.
  • Interface to native Python classes instead of the (pretty awkward) ctypes library (which generally requires a layer on top to be palatable).

The importance of these points cannot be overstated; the whole SciPy debacle is in part because no one wants to write / maintain a Python-C interface.

Also, I’ve not benchmarked it yet, but I suspect the overhead of ctypes is higher than that of direct Python-C-Fortran bindings.

Some other historical examples include:

  • M. G. Gray, R. M. Roberts, and T. M. Evans, “Shadow-object interface between Fortran 95 and C++,” Computing in Science & Engineering, vol. 1, no. 2, pp. 63–70, Mar. 1999, doi: 10.1109/5992.753048.
  • R. Bader, “A Fortran binding for the GNU scientific library,” SIGPLAN Fortran Forum, vol. 26, no. 2, pp. 4–11, Aug. 2007, doi: 10.1145/1279941.1279942.
  • A. Pletzer, D. McCune, S. Muszala, S. Vadlamani, and S. Kruger, “Exposing Fortran Derived Types to C and Other Languages,” Computing in Science & Engineering, vol. 10, no. 4, pp. 86–92, Jul. 2008, doi: 10.1109/MCSE.2008.94.
  • V. K. Decyk, “A method for passing data between C and Opaque Fortran 90 pointers,” SIGPLAN Fortran Forum, vol. 27, no. 2, pp. 2–7, Aug. 2008, doi: 10.1145/1408643.1408644.

And a bunch of others, though none discuss the binding to Python, which is less trivial than one might expect (and is only handled in an automated, maintained manner by f2py).

1 Like

I will point out that f2py is not the only automated set of bindings out there. My library GitHub - rjfarmer/gfort2py: Library to allow calling fortran code from python automates the creation of bindings between Fortran and Python. As long as your code is in a module, you need make no changes to it.

3 Likes

Thanks, that is a nice initiative. Some thoughts (just from a quick look):

  • I discussed and discarded dictionaries as an interface (wasn’t considered pythonic enough) at one point but the more the merrier.
  • Have you tried compiling SciPy’s fortran code / the F2PY test suite?
  • Depending on GFortran only is a pretty severe restriction (and I personally dislike ctypes: not portable at all, and it makes for ugly boilerplate calls)

Feel free to join the NumPy slack’s f2py channel if you’d like to discuss these more as well :slight_smile:

Question: when you say “not portable”, what do you mean? Honestly, ctypes is the only approach I have found that lets me use whichever compiler I want to build the shared library, and have that shared library be independent of the Python version. If what you don’t like is the maintenance of the intermediate wrappers, that’s one thing and yes, that’s true, but beyond that I’m not seeing the non-portability.

1 Like

Perhaps I am mistaken, but it seemed from the documentation that one would need different calls for different operating systems. To me this says “not portable”, though I suppose it could also be seen as a maintenance burden.

In general, I don’t really see how ctypes is a good idea. Yes, if you know exactly which compiler you used and with what arguments, and you ensure compatibility, you can make a call, but this is pretty much glorified object renaming (see @sblionel’s very valid argument against that here).

Sure, the Python libraries ctypes and np.ctypeslib help a bit, making the calls slightly less ugly than mangling symbols and making calls by hand, but it is still essentially the same approach with the same drawbacks. Works for personal projects where users are expected to “read the documentation” and compile accordingly for functionality… Not so useful as a basis for a generic tool.
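To make “a layer on top” concrete: raw ctypes requires every symbol’s restype and argtypes to be declared by hand before it is safe to call, and a wrapper’s whole job is to hide that per-symbol boilerplate. A sketch using libm’s cos/sin (assumes a Unix-like system; the `wrap` helper is just an illustration, not any library’s API):

```python
import ctypes

# On Linux/macOS the interpreter already links libm, so CDLL(None)
# can resolve cos/sin from the running process.
libm = ctypes.CDLL(None)

# Without these declarations ctypes assumes int in and int out, and a
# double-returning function silently yields garbage.
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

# The "layer on top": one helper hiding the per-symbol boilerplate.
def wrap(lib, name, restype, argtypes):
    fn = getattr(lib, name)
    fn.restype, fn.argtypes = restype, argtypes
    return fn

sin = wrap(libm, "sin", ctypes.c_double, [ctypes.c_double])
assert libm.cos(0.0) == 1.0
assert abs(sin(0.0)) < 1e-12
```

Every exported symbol needs this treatment, which is why wrapping real libraries by hand scales poorly compared to generated bindings.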

I also don’t understand what you are saying here. Once you define the C-compatible interface using the C-interoperability features on the Fortran side, you can call it just like any other C routine. The compiler or operating system doesn’t matter. It’s way less burden than injecting another tool into build or run time. Maybe there is some miscommunication here…

Yes, exactly. I have found the same. Build the shared library (with C-compatible interfaces) and that’s all you need. To me that is way easier and more portable (vs. trying to fight SWIG to get it to do what you want and then dealing with extra platform- and/or Python-specific libs floating around; I’ve never used f2py).

1 Like

I think we’re talking at cross purposes, though; as can be seen in SciPy today (and over the past 20+ years), the dominant way to interface standard Fortran libraries with Python has been via additional tools, because no one has written / maintained wrappers to the underlying codes.

I’ve never used f2py

As a quick primer, what it does is:

  • Generates Fortran wrapper code (for compatibility)
  • Writes out C code interfacing to the Fortran code
    • Importantly, writes out Python-C code

Finally, the resulting Fortran files (including the wrappers) and the C code (which links to Python at compile time) are linked into a nice importable extension module.

This is fully automated as in the example above.

For example, without having to know much about Fortran or Python, you can pass arrays, strings or callback functions to and from both languages via f2py. If you have the luxury of bind(c) types, you can generate direct interfaces, tested on all three operating systems without handling .dll files specially. Certainly one could argue that everyone should write their own wrappers with well-designed APIs exposing exactly the functionality they need. I don’t think the general attitude that ctypes does the same thing is helpful in the least for general binding tools, which can be and are used in production Python code.

Build the shared library (with C-compatible interfaces) and that’s all you need.

Generally, Python users are interested in something they can grab from PyPI. I’m not sure this would work for distributing wheels, but the f2py-compiled modules certainly do (e.g. SciPy).

As noted above, if ctypes works for you, it works. However, from the perspective of “will this help people actually keep Fortran in Python” I don’t see it helping at all.

Plus, a failure in ctypes ends in a nice segfault, which is never fun (as noted by @certik). Perhaps the closest argument is that, with bind(c) ensuring uniformity of calls, there is no real benefit compared to ctypes beyond not having to write things by hand (assuming it is trivial to ensure the .dll or .so is in the right place at the right time when called from Python). However, f2py supports far more than bind(c), and also works for older pre-bind(c) code. It also has nice helpers for NumPy arrays in particular.

So, just to be clear, I have no dogma here, just practical experience. When I first found out about f2py I was very happy, because I thought it would help me create a Python API out of kernels and libraries I maintain for other purposes. But no: f2py (as I understood it) was basically behaving as a “build system” of its own, imposing two key restrictions that were a no-go for me: 1) an upstream dependency of my kernels (which are a mix of Fortran/C/C++) on the given Python version, and 2) an upstream dependency on a subset of Fortran standards… I guess in summary the problem for my use case is that I’m not writing Fortran code as a backend for Python but for full-fledged software; the Python API is just a subset.

The second point is also kind of true for iso_c_binding: one cannot directly bind a derived type which contains allocatables, pointers or other derived types as members… But if I already have to circumvent that for iso_c_binding interoperability or for f2py, I might as well do it for the one that imposes the fewest restrictions. And being able to compile a shared object (well, two: one for Windows and one for Linux) that I compile just once, create a .whl that I can then install in any version of Python from 3.6 all the way to 3.11, was honestly a marvel I never thought possible.

I do think f2py is very practical, I’m not trying to diminish it in any way, but it has constraints that are too heavy if one already has a large machinery running and wants to extract a few bits out of it without duplication of kernels and downsizing to a subset of the language.

In the end, the problem with this discussion is what “the general case” actually means. We all have our use cases, experiences and biases… Could this discussion perhaps bring together some of the drawbacks and strong points of each approach, to see if something could evolve?

2 Likes

This is not true, though it could be said to be historically ambiguous. f2py will generate .pyf files, from which it will generate .f90 and .c files. At compile time, the user is required to link to a Python version to get a Python extension module (import blah), which can also be packaged into a wheel suitable for PyPI distribution.

In fact, if the shared library is built with Meson, or can be linked in at compile time, there’s no need to recompile it at all; it is just linked to the wrapper.

The confusion comes from when f2py was used with distutils, which limited how it could be integrated into other projects. Today (NumPy 1.26 onwards), f2py no longer forces the use of distutils or inflicts setup.py files on the user, but instead generates a meson.build file which can be integrated into an existing project.

  1. An upstream dependency on a subset of Fortran standards

This is true, but as noted it is true for iso_c_binding as well. In fact, f2py supports a wider range of features than the iso_c_binding interface (e.g. F77 support), so arguably f2py is the more permissive option here. Personally, it took me a long time to write all the (mechanically similar) pointer and c_ptr interfaces to the trivial derived type shown in the post, so I would prefer to have them automagically generated. The last part is the interaction with Python itself, and this is the part which f2py handles and which is non-trivial to implement by hand (e.g. all the Python-C code in the post), but which does make it more “pythonic”.

And being able to compile a shared object (well, two: one for Windows and one for Linux) that I compile just once, create a .whl that I can then install in any version of Python from 3.6 all the way to 3.11, was honestly a marvel I never thought possible.

This is nice, but it is actually not true. Yes, you can install it in any version of Python, but you must ensure the system dependencies are the same for your shared library. This is one of the reasons it can’t be uploaded to PyPI. Also, it is non-trivial to ensure LD_LIBRARY_PATH and other variables are set up so the shared library can be found. I know it is common in scientific computing to have environment variables which set the locations of shared libraries, but it isn’t a very good practice…
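The path problem is visible from Python itself: `ctypes.util.find_library` consults only the standard system locations (the ld.so cache on Linux), so a library shipped alongside your own code is invisible to it unless LD_LIBRARY_PATH or an explicit path steps in. A sketch (the `/opt/mylib` path in the comment is purely hypothetical):

```python
import ctypes
import ctypes.util

# find_library consults only standard system locations; it knows
# nothing about a shared object sitting in your project tree.
soname = ctypes.util.find_library("c")      # e.g. 'libc.so.6' on Linux

# CDLL(None) falls back to the running process's own symbols, so this
# still works even if find_library came up empty.
libc = ctypes.CDLL(soname)
assert libc.abs(-5) == 5

# A library outside the system paths must be loaded by explicit path,
# which is exactly what LD_LIBRARY_PATH papers over (hypothetical):
# mylib = ctypes.CDLL("/opt/mylib/libmylib.so")
```

A wheel sidesteps all of this because the extension module and its bundled libraries are installed where the import system already looks.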

I do think f2py is very practical, I’m not trying to diminish it in any way, but it has constraints that are too heavy if one already has a large machinery running and wants to extract a few bits out of it without duplication of kernels and downsizing to a subset of the language.

I understand, but the point of these discussions is to find good directions for f2py to grow and to try to remove (what I feel are) some slight misunderstandings about its features and use.

In the end, the problem with this discussion is what “the general case” actually means. We all have our use cases, experiences and biases… Could this discussion perhaps bring together some of the drawbacks and strong points of each approach, to see if something could evolve?

+1. At the moment, what I mean by the general case is viewed from the “Python” side (e.g. SciPy) instead of the Fortran / scientific computing side.

1 Like

No, I haven’t tried compiling F2PY’s or SciPy’s test suites, as most of that code isn’t in a module, so I’m not personally interested in supporting it.

Yes, a hard dependency on gfortran is both gfort2py’s greatest strength and its greatest weakness. Sure, I can’t support other compilers; on the other hand, my life is much simpler, as I don’t have to worry about what other compilers are doing.

I’m not sure why you think ctypes isn’t portable; I have 8 lines of code in gfort2py that care about the OS (and that’s just for selecting the extension for shared libraries). It even worked on Windows before I had, at that point, even tried to get it to work on Windows.

Works for personal projects where users are expected to “read the documentation” and compile accordingly for functionality… Not so useful as a basis for a generic tool.

Yet gfort2py works without any modification to the source (so no annotations and no limitations on intent). A user of gfort2py doesn’t even need to know what ctypes is; it’s entirely wrapped and transparent to the user, much like f2py wraps the messiness of building a Python-C interface to present something useful to the user.

I don’t think the general attitude that ctypes does the same thing is helpful in the least for general binding tools, which can be and are used in production Python code.

Except it can. Sure, ctypes on its own doesn’t help, but you wrap ctypes and then you can do anything you want. gfort2py has support for scalars, explicit arrays, strings, assumed-shape arrays, allocatable arrays, and derived types. Things not currently supported are more due to my limited time than to any fundamental limitation of ctypes.

No. This is simply not true. Let me put this in a simpler context. Since shared libraries cannot be built and uploaded to PyPI (without actually writing Python-C code / going through the auditwheel process etc.; see cibuildwheel for more information), there will never be a pip install version of a library which interfaces through ctypes, which means that, as far as the Python ecosystem is concerned, it is a non-starter for Fortran in Python.

Yet gfort2py works without any modification to the source (so no annotations and no limitations on intent). A user of gfort2py doesn’t even need to know what ctypes is; it’s entirely wrapped and transparent to the user.

For Fortran programmers who want to provide Python bindings, ctypes will be fine, along with the note from the README about LD_LIBRARY_PATH.

For more of an understanding of these issues, there is the post on SciPy’s build-system issues, and the simple fact that, as it stands, no user ever needs to compile SciPy’s Fortran code themselves; they get it in a binary wheel, which gfort2py cannot provide because of a ctypes limitation.

Note that you can build a local wheel and distribute it (with a prayer that all the system dependencies are the same), but it needs to be “compatible” (via cibuildwheel, typically) for PyPI distribution.

The limitation has been listed earlier as well: users need to have the library (either the .dll or the .so) at runtime, and this path needs to be correctly set for every new library. This is in sharp contrast to the wheel approach, where Python handles this automagically.

When was the last time a pip install required setting an environment variable? There are simply very different expectations in different communities. Fortran programmers are expected to handle their libraries themselves, and the wrappers; Python users largely are not.

I have 8 lines of code in gfort2py that care about the OS (and that’s just for selecting the extension for shared libraries).

Compared to the zero lines needed with actual Python-C code. I’m not saying it is too hard to support, and gfort2py is excellent work. However, we have differing views on what counts as portable. Portability is not the same as popularity or ease. That it is simple to depend on gfortran only doesn’t make that acceptable for cases like SciPy or the rest of the Python community, no matter how useful and elegant it is for the Fortran programmers who use only gfortran.

I do want to be clear that gfort2py is a good project with clear goals and does the Fortran community a lot of good. However, ignoring the real reasons why it is not acceptable for more general use is also not a great attitude.

Another point is closer to the compiler issue. As noted here: https://fortran-lang.discourse.group/t/the-difficulties-of-using-fortran-in-scipy/

One of the main issues is the lack of compatible Fortran compilers. This means that gfortran on Windows, depending on where it comes from, will be unable to link correctly to the C libraries… but this is an issue with compiling the library… until you try to share the .dll file with someone else and they can’t find symbols because their gfortran is different…

It is, however, impossible to fully describe the portability problems like this. A practical test, as mentioned, is to see whether it is possible to get a pip-install-friendly binding + library out of this method, and then whether it can be reliably built across the ~53 different configurations NumPy supports.

Which I never did; my original post was simply pointing out that f2py is not the only game in town. I have never claimed it would be a replacement for use in SciPy.

No. This is simply not true.

Here I was simply discussing the implementation of Fortran features not the annoyance of getting libraries working on different machines.

1 Like

For use by the Python ecosystem it is, because:

the annoyance of getting libraries working on different machines.

Is a large part of working in Python. Also, it isn’t just that the libraries are hard to install; it’s also that ctypes is fundamentally unable to work without the shared library at hand, so it cannot be used generally.

Just FYI: it may be true for pip (I don’t know what auditwheel or cibuildwheel are, frankly), but not for conda. All these problems are solved by conda (i.e., you build the libs yourself, people can install them just fine on each platform, and you don’t have to mess around with LD_LIBRARY_PATH).

That’s what I do. I can write Fortran code of any complexity and compile it as a shared lib with whatever build system I want. That includes manually writing interfaces using the C-binding feature of Fortran (which isn’t really much more work than writing a C header file, for example). Then I can call it from any version of Python I want. It is deployed to users as a conda package, which takes care of the dependencies and the dll/so/dylib path issues.

1 Like

Yup, conda / spack / nix etc. will all take libraries built this way. Even Lmod can be used to hack paths semi-automatically (as is done in most HPC centers). Or direnv will automatically change path variables based on your directory, so you can have your paths the way you want them…

However, I see that as a scientific-community thing and fundamentally off-topic here, IMO. Python as commonly used is the focus of my post. This means pip-installable libraries, or things which help the existing ecosystem (e.g. NumPy / SciPy). Other approaches are either stop-gap solutions or miss the point completely. I don’t want to get into the many problems of conda or its ecosystem (it would take us too far afield, but again, it has been discussed quite often in the SciPy / Python packaging ecosystem).

My favorite example of this logic is the symbol-renaming example, which can be used to interact with basically anything, anytime, but is never a good idea. Everything else forms a spectrum between fully supported and hacking compiled objects.

As another example, it is why even the excellent libraries you maintain haven’t made their way into SciPy. It is of course unfair to expect the community here to work directly on F2PY (as an example), but it would at least help to recognize what it brings beyond current practice.

The only reason I am belaboring the point is that if efforts were made to integrate from both ends, it would be more productive. The point of these posts is to get more of the Fortran community interested in opening issues w.r.t. F2PY, with the goal of getting modern Fortran into SciPy and making it more generally acceptable.

Honestly, the community here has the most positive opinion of ctypes I’ve ever seen. Even the NumPy documentation is quick to point out its flaws:

The source code itself explains even better why it isn’t considered to be that great.

Now, these issues may not have been encountered by people here yet, but the warnings are there for a reason: NumPy has far more users, and so sees far more of the sharp edges of the ctypes interface in general.

1 Like

Well, I think you are underestimating the use cases of conda here. It solves significant problems that the rest of the Python packaging dumpster fire does not. The scientific community are the people using these libraries, so it isn’t an insignificant thing.