Types from Fortran to Python via Opaque Pointers

rgoswami · November 15, 2023, 1:06am

Conda packages are typically not compiled for performance
- Even the compiler stack is pretty flaky (since they muck around with the names)
They vendor old versions (e.g. gfortran on Windows has been outdated for years now)
Some of these sharp edges are covered in the other article, some in the (older) NumPy docs
More on general scientific software and Python is in the packaging docs
It is very easy to get a shared library from conda that expects something on your system and subsequently does not work (I think biber is still a good example of that)

For more on why conda may not be great for scientific use, see Spack (which I use often) and Easybuild (similar enough), which are really more robust since they allow variants and different compilation options…

Since we are now far enough away from the original topic, I should point out this is not unique to Python. CRAN (for R) has strict restrictions on (among other things) the kind of compiled code you can distribute. For this, biologists have Bioconda (also for python I guess). However, R packages still develop with CRAN in mind typically, because that’s the ecosystem.

Similarly, Python ↔ PyPI, regardless of the many other ways you can get some kind of Python interaction / setup.

Here is more on pip and conda.
This is the Easybuild comparison page

Conda is a package manager that runs on Windows, macOS and Linux, and is very popular in the scientific community.

It focuses on quick installation of software and ease of use, and lets users create a conda environment in which they can install one or more packages. These packages are usually pre-built generic binaries however, which may significantly impact the performance of the installations.

Despite wide adoption in the scientific community conda is not a good fit for HPC systems for a number of reasons, including poor support for multi-user environments, a lack of focus on performance, heavily relying on the home directory (which usually is limited in size on HPC systems), and more. There is also no guarantee that it will install libraries that are compatible with the hardware of the cluster you’re working on, so the Conda-installed software may not always talk properly to the cluster interconnect or resource manager. See this link for a more detailed discussion.

In addition, software installed via conda usually does not mix well with software installed through environment modules.

^ Relevant parts extracted. How much this matters to you as a developer / user will differ from person to person. In my day-to-day work I need high performance. I use nix on HPC or compile the world via spack with whatever compilers best suit the machine (once I did this by hand)… Though that is far away from the average Python user trying to pip install scipy to fit a spline or something. I rarely use Windows either, but that doesn’t mean I would be willing to say it isn’t of relevance to anyone because there are better alternatives…

ivanpribec · November 15, 2023, 1:18am

I found your article interesting. Generating “native” Python extensions in C is a worthy cause. The PyCapsule appears to fulfill the same purpose as assumed type type(*) objects do in Fortran.

jacobwilliams · November 15, 2023, 3:12am

Spack and nix don’t work on Windows though (and no, WSL is not Windows).

rgoswami · November 15, 2023, 4:02am

When in Rome… For Windows one should use Chocolatey or msys2 or something. WSL is sadly a non-starter. The SciPy docs have a list… I think rtools is the easiest compatible toolchain since it has Fortran and C compilers bundled together.

It is still unclear to me why one would prefer glorified symbol hacking to actual bindings. We started from needing just a one-time run of f2py to generate C and Fortran, which can be built on any operating system and used to produce actual compiled Python extensions, and we’ve apparently ended up discussing the need to use conda to manipulate paths on Windows? The original argument that ctypes is more desirable than actual compilable code is still beyond me, but again, to each their own.

Using a standard library module is not the same as generating native code. This I hope is clear. ctypes does what it does by basically calling symbols from shared libraries… Treating ctypes as equivalent to a compiled extension module is a weird starting point to begin with.
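To make "calling symbols from shared libraries" concrete, here is a minimal sketch of what a ctypes call looks like. It uses the C library's strlen purely as a stand-in for a Fortran routine exposed through iso_c_binding; the library-loading branch and the choice of strlen are assumptions made for illustration and do not come from the thread.

```python
import ctypes
import sys

# Loading the library is already platform specific: on POSIX, CDLL(None)
# exposes the symbols visible to the running process (including libc),
# while on Windows the C runtime has to be named explicitly.
if sys.platform == "win32":
    libc = ctypes.CDLL("msvcrt")
else:
    libc = ctypes.CDLL(None)

# The symbol is looked up by name at run time. Nothing verifies that the
# signature declared here matches the compiled code, so it has to be kept
# in sync with the library by hand.
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"fortran")
print(length)
```

If the declared argtypes or restype disagree with the compiled routine, the mismatch only surfaces at run time, typically as wrong results or a crash.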

Enumerating more ctypes failure modes is unlikely to be useful, but also consider the issue of propagating errors. There’s no graceful raising of a Python error in a ctypes wrapper, because a fault in the library segfaults the whole process and nothing of use can be propagated (e.g. Stack Overflow).

Honestly I would say it is better to provide a good CLI and use subprocess.run instead of ctypes… That’s more useful and can at least have better error handling.
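A rough sketch of the CLI route, assuming a hypothetical command-line solver that prints its result to stdout and reports failures through a non-zero exit status and a message on stderr. To keep the example self-contained, the "solver" is faked with the Python interpreter itself; SolverError, run_solver, OK_CMD and BAD_CMD are all made-up names.

```python
import subprocess
import sys


class SolverError(RuntimeError):
    """Raised when the (hypothetical) command-line solver reports a failure."""


def run_solver(argv, timeout=30.0):
    """Run the command and turn a non-zero exit status into a Python exception."""
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        # Even if the child process crashes outright, only the child dies;
        # the calling interpreter just sees a failing return code.
        raise SolverError(
            f"solver exited with status {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout.strip()


# Stand-ins for a compiled Fortran executable (assumption for illustration).
OK_CMD = [sys.executable, "-c", "print(2.5)"]
BAD_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('singular matrix'); sys.exit(3)",
]

print(run_solver(OK_CMD))
try:
    run_solver(BAD_CMD)
except SolverError as exc:
    print(f"handled: {exc}")
```

The price is process start-up time and passing data through text or files, so this suits coarse-grained calls rather than tight loops.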

rgoswami · November 15, 2023, 4:22am

Glad to hear that, most of the Fortran-C side of things was directly inspired by your posts here.

The PyCapsule appears to fulfill the same purpose as assumed type type(*) objects do in Fortran.

Yup, they’re not very well documented for some reason though… (in general the C-Python documentation is less than stellar).

jacobwilliams · November 15, 2023, 4:36am

I’m not sure we’re even talking about the same thing.

What do you mean by “symbol hacking”?

Is ctypes not the standard way to call C code from Python? Presenting a C interface (using iso_c_binding) in a Fortran library is the same thing as C code as far as Python is concerned. What is the problem with that? (Yes I understand the exception issue…but you just have to make sure you write good Fortran code that handles errors internally, same as you would any other compiled language).

I think probably you are from the perspective that the Python is the real code and Fortran is sort of a sidekick…but I think the opposite. The real code is the Fortran code. That’s the code that is performant and will be used for decades. Python is a convenient wrapper that will probably one day be replaced by some other language. But the Fortran can live on. If it’s callable as C it can be used by any language with no need for bespoke build tools, it will work on any platform as long as you have a Fortran compiler, you can compile it with whatever build system you want, and it will never stop working (as Fortran never breaks backward compatibility, which cannot be said of Python). How the library is installed or what language it is called from are just details. A “compiled extension module” (whatever that is) is of no use to any language but Python.

rgoswami · November 15, 2023, 4:55am

OK I think this is where the main confusion lies. The short answer is no. ctypes is part of the standard library, but it is a barely maintained foreign-function layer whose library lookup shells out to objdump (e.g. here) and similar tools, and it is meant for interacting with shared libraries when you don’t have the source. As discussed in multiple instances, the Python ecosystem itself doesn’t support ctypes usage well (actually at all), and it is never in production anywhere (because shipping blobby libraries is a thing of the thankfully gone past).
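As a small illustration of the lookup problem: ctypes.util.find_library is the standard-library helper for locating a shared library by name, and on Linux its documentation notes that it runs external programs (ldconfig, gcc, objdump and ld) to do so. The sketch below only shows that the answer depends on the machine it runs on; the library names queried are arbitrary examples.

```python
import ctypes.util

# find_library returns a name or path only if the platform's lookup tools can
# locate the library; otherwise it returns None. The names are just examples.
candidates = ["m", "c", "gfortran", "openblas"]
found = {name: ctypes.util.find_library(name) for name in candidates}

for name, location in found.items():
    print(f"{name}: {location if location is not None else 'not found'}")
```

Any wrapper that relies on this lookup has to decide what to do when the result is None, which is exactly the kind of path handling a compiled and linked extension module avoids.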

The correct way to interface C code and Python is to create a C-Python extension module [see a caveat at the end of the post though], also called a “native” extension. Another way to think about this is that Python is just a C library itself. If you look at the post, the construction of Python objects and classes is all done via the C-Python interface. This is the correct way. This is how, for example, PyTorch (a rather well-known C++ library) is used from Python (well, via the pybind11 wrapper generator); it is actually used more from Python than from C++!

In fact, no need to go all the way to C++. NumPy is basically a C library. I hope it is clear to everyone that it is not interfaced to Python via ctypes.

So in that sense, I hope it is now clearer why f2py takes such pains to take Fortran code and write C-Python wrappers for it.

What is the problem with that?

Just to reiterate:

Portability (Windows + requiring libraries)
Speed of execution (ctypes calls are slower)
Inherent inability to handle errors

(Yes I understand the exception issue…but you just have to make sure you write good Fortran code that handles errors internally, same as you would any other compiled language).

No, this is a separate issue. Having a good exception in Fortran is only good if you can send it meaningfully to the caller! As an example, if Python is used as a server and wants to offload some heavy computational task to Fortran, it cannot do so if the only interface is via ctypes, because any error in the input or elsewhere would crash the interpreter. Libraries are generally designed not to segfault, and any segfault is typically a bug, except with ctypes wrappers, where nothing can be done; hence they are not a “best practice” by any means.

In fact, think a bit more about the exception issue. The real reason you get a segfault there is because there’s no C code which can handle the exception for Python, because it was just a call from a shared library. When you have (either hand-crafted or generated) C-Python code you can catch the exception raised from Fortran and translate it to a Python error which can in turn be handled by the user in whatever manner they choose without shutting their whole interpreter down.
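The translation step described here can be sketched in pure Python. In a real extension module it happens in C (for instance by calling PyErr_SetString before returning to the interpreter); below, a stub stands in for the compiled Fortran routine, and the status codes, messages and function names are all hypothetical.

```python
class FortranError(RuntimeError):
    """Python-side exception standing in for an error reported by Fortran."""


# Hypothetical status codes; a real library would document its own.
STATUS_MESSAGES = {1: "invalid problem size", 2: "solver failed to converge"}


def fortran_solve_stub(n):
    """Pure-Python stand-in for a compiled routine returning (status, value)."""
    if n <= 0:
        return 1, 0.0
    return 0, 0.5 * n


def solve(n):
    """Mimic the wrapper layer: turn a non-zero status into an exception."""
    status, value = fortran_solve_stub(n)
    if status != 0:
        raise FortranError(STATUS_MESSAGES.get(status, f"unknown status {status}"))
    return value


def handle_request(n):
    """A server-style caller can report the failure and keep running."""
    try:
        return {"ok": True, "value": solve(n)}
    except FortranError as exc:
        return {"ok": False, "error": str(exc)}


print(handle_request(4))
print(handle_request(-1))
```

The point of the sketch is only the shape of the flow: the error reaches the caller as an ordinary exception, and the interpreter stays up.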

I think probably you are from the perspective that the Python is the real code and Fortran is sort of a sidekick

No, I fully believe the Fortran code is key here. What I also believe, though, is that the wrappers should be efficient and allow all the advantages of the wrapping language, i.e. what I would like (as I tried and perhaps failed to show in the post) is a robust mapping of a Fortran derived type to a class in Python. All the standard Python glue code without the fragility of segfaults / path manipulation.

If it’s callable as C it can be used by any language with no need for bespoke build tools, it will work on any platform as long as you have a Fortran compiler, you can compile it with whatever build system you want, and it will never stop working (as Fortran never breaks backward compatibility, which cannot be said of Python). How the library is installed or what language it is called from are just details. A “compiled extension module” (whatever that is) is of no use to any language but Python.

But this is the same reason I’m advocating for the (automated) generation of C-Python code from Fortran… For what it’s worth, the bindings generated by f2py are backwards compatible through Python versions; sure, you need to link to one version at compile time, but that’s true of every library.

To me then:

Fortran code should have a C wrapper
The C wrapper should have a Python binding (natively)
The wrapper should be compiled and linked to the Fortran code

The caveat I mentioned above is that technically C-Python and extension modules are meant only for, well, “reference” Python (the one you get from python.org), not the “language” (e.g. PyPy / Jython etc.), but no one ever needs to discuss those (they won’t work anyway with either approach).

To be clear, as noted a couple of times, I’m not arguing against a C wrapper for Fortran code and noted earlier such ISO_C code is already supported by f2py in NumPy 1.26 onwards, except c_ptr. I’m arguing against ctypes as a good way to interface any C / C++ code to Python. This is what I meant by there being no other community with so much faith in ctypes. There are hundreds and thousands of C and C++ libraries which offer Python wrappers and I’m almost 100% sure none of them offer a ctypes interface (certainly one cannot via PyPI).

TL;DR If you (a) have the source code for a C library (or any language with a C wrapper), write a native extension. If you (b) have control over the code but don’t want to write a native extension, add a CLI and use subprocess.run. If you have neither a CLI nor C experience and (c) somehow have a shared library without source code… then ctypes will have to do. Where the Fortran community generally wants to fall on this spectrum will depend on each individual. As noted, for Fortran, there’s f2py if you have Fortran sources and want (a) without writing hand-crafted C code.

jacobwilliams · November 19, 2023, 5:06pm

This is all great information. Thanks! I wish any of this was mentioned in the actual Python documentation.

When I look at “Extending Python with C or C++”, one of the first things I see is:

Note The C extension interface is specific to CPython, and extension modules do not work on other Python implementations. In many cases, it is possible to avoid writing C extensions and preserve portability to other implementations. For example, if your use case is calling C library functions or system calls, you should consider using the ctypes module or the cffi library rather than writing custom C code. These modules let you write Python code to interface with C code and are more portable between implementations of Python than writing and compiling a C extension module.

There’s no indication whatsoever in the ctypes documentation of any of this either.

In any event, extension modules look hideously complex. So now I do finally understand why you would want to use f2py to automate it!

rgoswami · November 19, 2023, 6:23pm

Yup, I noted this as the caveat in the earlier post. However, it is really a very annoying note, given that even the most mature alternate Python implementation (PyPy) actually has a fork of NumPy and a specialized compatibility layer for every function of a particular NumPy version…

The state of “alternate python implementations” is generally such a mess that it isn’t what people think of when they talk about Python.

There’s no indication whatsoever in the ctypes documentation of any of this either.

Yup, but it also hasn’t been touched in many years.

In any event, extension modules look hideously complex. So now I do finally understand why you would want to use f2py to automate it!

I’m glad we were able to get here. I was also very confused for so long about the ctypes advice here.

certik · November 20, 2023, 4:08am

Yes, in addition to what @rgoswami said, ctypes has many old bugs that we encountered in LPython regarding structs and alignment / packing. It’s very complicated because C struct alignment is very complicated and platform dependent. Overall if you want robust and fast Python wrappers, you have to create an extension module (i.e., C code that uses the Python C/API and link everything together).
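The alignment point can be seen from Python itself. The sketch below declares a hypothetical two-field struct (the name Sample and its fields are made up for illustration) and prints how much padding the platform ABI inserts. A ctypes declaration has to reproduce the compiled layout exactly, and nothing checks that it does.

```python
import ctypes


class Sample(ctypes.Structure):
    """Hypothetical mirror of a C struct: struct { char tag; double value; }."""

    _fields_ = [
        ("tag", ctypes.c_char),
        ("value", ctypes.c_double),
    ]


# Bytes actually carrying data versus bytes occupied by the struct.
payload = ctypes.sizeof(ctypes.c_char) + ctypes.sizeof(ctypes.c_double)
total = ctypes.sizeof(Sample)

# The double cannot start right after the one-byte tag; it is pushed to the
# next offset that satisfies its alignment requirement on this platform.
value_offset = Sample.value.offset
double_alignment = ctypes.alignment(ctypes.c_double)

print(f"payload bytes:   {payload}")
print(f"struct size:     {total}")
print(f"offset of value: {value_offset} (alignment {double_alignment})")
```

The numbers differ between platforms and compilers, so a hand-written mirror of a derived type or struct can be silently wrong on one target while working on another.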

I will point out that even though @rfarmer’s gfort2py currently seems to use ctypes, the tool could be upgraded/rewritten to create an extension module instead, so there is no fundamental limitation in the approach. The one exception is its dependence on GFortran’s module format, which was not meant to be used by other tools like this (but it can be used at least to some extent, since the mod file must at the very least contain information about the public interface of the module, for use by gfortran itself). We are planning a similar capability in LFortran and already have a working prototype, see lfortran pywrap. We currently generate Cython wrappers, but I am hoping we could merge efforts with f2py instead.

I think compilers should take care of the Fortran side (correct parsing, semantics, etc.), and then a tool should pick it up and wrap it correctly to any language.
