About f2py, IPC and Wrappers

rgoswami · September 14, 2021, 1:19pm

I was wondering what the general view of the community here is with regard to f2py.

In particular it’d be fantastic to hear:

- About attempts to use f2py
  - What went well / went wrong (other than derived types)
  - When was the last attempt?
  - What was the project / background / context?
- What was missing (documentation?)
  - What could alleviate the above?
- Do you know of projects which use f2py?
  - Do any of them use f2py without numpy?
- Are there any strong feelings (for / against) the .pyf format?
  - Also the C notation in comments

To answer some of the above:

SciPy uses f2py extensively
Sphinx-fortran uses crackfortran from f2py without numpy AFAIK
MsSpec uses f2py
SEAPY
wrf-python
Slycot
PyHyp
QUIP uses f90wrap

Also beyond f2py, but:

What role do wrapper generators have in the future of Fortran?

The last one is of particular interest to me due to my involvement in the fantastic LFortran effort (which will generate automated Python wrappers due to @certik and @hsnyder) among other things.

Disclaimer: I have started working to support the development of f2py with @melissawm and Pearu Peterson

awvwgk · September 14, 2021, 1:45pm

I’m mainly using CFFI from the Python side and bind(C) from the Fortran side to export APIs from Fortran to Python. While I have to write and maintain a C-API for this purpose, I consider this “free” C-API a plus, since it allows interfacing to other languages as well by one uniform API.

Regarding f2py, the missing support for derived types (when I last checked) ruled it out from the very beginning, as my C-APIs are almost always based on objects rather than on procedures. Also, build system integration is usually a huge point in using such tools, and I wasn’t able to figure out whether I can use f2py from meson without introducing too many hacks.
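As a rough illustration of the C-ABI route described above: once a Fortran procedure is exported with bind(C), Python loads it like any other C symbol and only needs its signature declared. The sketch below uses the standard-library ctypes module as a stand-in for CFFI and calls libc's abs as a placeholder for a bind(C) procedure; both substitutions are assumptions made so the snippet runs without a compiled Fortran library, and CDLL(None) is POSIX-only.

```python
import ctypes

# CDLL(None) exposes symbols already loaded into the process (POSIX only).
# A real project would load its own shared library by path instead.
libc = ctypes.CDLL(None)

# Declare the C signature explicitly, as one would for a bind(C) procedure.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

value = libc.abs(-7)
```

The same declared signature could be consumed from other languages too, which is the "free" C-API benefit mentioned above.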

Maybe interesting in this context is the following thread, which covers many different opinions on interfacing Fortran and Python:

nicholaswogan · September 14, 2021, 5:07pm

My main fortran project is a model of atmospheric chemistry called PhotochemPy. It relies heavily on f2py. To build I use CMake with skbuild, which I think is a great build system (simple examples if you are interested).

Context: PhotochemPy is my attempt to modernize a Fortran 77 version of the code called Atmos. There are a bunch of annoying things about the Fortran 77 version. For example, all array lengths are set at compile time, so if you add a molecule or reaction, you need to recompile the whole program.

What went well: I like f2py because wrapping is automatic - when I add a new module variable, f2py will automatically create the setter and getter functions so I can access that variable from python. This is essential for projects that are changing quickly. I don’t want to spend a ton of time creating a wrapper, which will be broken next week because of changes I made to the code.

What went wrong (excluding the lack of derived type support):

- It took me a long time to discover and figure out the skbuild + CMake + f2py build approach. CMake is really nice here because I can compile project dependencies (written in C or C++ or whatever), then link them later when generating the Python wrapper. Again, here is a simple example
- Character, allocatable module arrays don’t work with f2py.
- Numba cannot call Fortran code wrapped with f2py. You must use ctypes instead.
- Ctrl-C doesn’t stop Fortran called from Python.
- Callback errors are possible but not easy.

What role do wrapper generators have in the future of Fortran?: I think they have a very big role for scientists who model things. Graduate students in science usually start with very little coding experience. Usually they will have some experience with Python or Matlab. But they don’t know what a compiler is, or understand the importance of types (because they started with Python), or understand pointers, or objects, etc. This is all understandable because their main job is to know math and physics. I speak from personal experience here.

However, their advisors give them old Fortran codes to work with. Given their lack of familiarity with Fortran, they will want to interact with this code via Python. This is understandable. Scripting in Python is so easy and intuitive, and will be hard to beat (even with LFortran). However, most often graduate students use a file-wrapping approach: in Python, they write a file with some inputs, then run a Fortran executable which reads those inputs, does its work, and writes an output file, which is then read back by Python. This sucks for a bunch of reasons.
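For readers who have not seen it, the file-wrapping approach looks roughly like the sketch below. The "executable" here is a tiny Python script standing in for a compiled Fortran program, and the file names and the run_model helper are hypothetical; the point is only to show the write-run-parse round trip.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for a compiled Fortran executable (an assumption for this sketch):
# it reads numbers from the input file and writes their sum to the output file.
FAKE_MODEL = """
import sys
with open(sys.argv[1]) as f:
    values = [float(tok) for tok in f.read().split()]
with open(sys.argv[2], "w") as f:
    f.write(f"{sum(values)}\\n")
"""

def run_model(values):
    """Write inputs to a file, run the executable, then parse its output file."""
    with tempfile.TemporaryDirectory() as tmp:
        inp = Path(tmp) / "input.txt"
        out = Path(tmp) / "output.txt"
        inp.write_text(" ".join(repr(v) for v in values))
        # With real legacy code this would be the Fortran binary, not Python.
        subprocess.run(
            [sys.executable, "-c", FAKE_MODEL, str(inp), str(out)],
            check=True,
        )
        return float(out.read_text())

result = run_model([1.0, 2.5, 4.0])
```

Every call pays for process start-up and text formatting, and any change to the input or output layout has to be mirrored by hand on both sides, which is part of why a generated wrapper is attractive.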

It would be great if graduate students (and other researchers) had a wrapping tool like f2py, but one that just worked all the time and had absolutely excellent documentation aimed at the graduate student who doesn’t know anything.

Side comment on documentation: One of the great drawbacks of Julia is that the documentation IS NOT aimed at the graduate student who knows nothing about computer science. They basically assume you come from a C or C++ background. This is where LFortran and fortran-lang can shine. Documentation and user experience should be aimed at computer science dummies like most scientists.

rgoswami · September 14, 2021, 9:34pm

Thanks so much @nicholaswogan for the detailed breakdown! I 100% agree that wrapper tools are more familiar for science grads, and though the ISO_C_BINDING facilities are great, not everyone will be interested in learning to use them.

One of the strengths of f2py might lie in the flexibility (i.e. by being a code generator instead of being a compiler it can take more opinionated decisions at times)…

Documentation is a major focus area, build systems should be up sometime this week.
- I’ll also write up some callback error examples
Thanks for pointing me to the character, allocatable and the Numba issues
I’m not sure the CTRL-C can be fixed by f2py; does this happen for ctypes as well?

P.S. PhotochemPy seems fantastic

nicholaswogan · September 14, 2021, 10:28pm

I criticize some things, but overall f2py is an incredible tool. Thanks for your work on it.

I’m not sure the CTRL-C can be fixed by f2py; does this happen for ctypes as well?

CTRL-C works with ctypes. f90wrap says they got CTRL-C to work with a modified version of f2py? Never used it though.

certik · September 14, 2021, 11:42pm

Here are links to the relevant efforts in LFortran for creating Python wrappers automatically:

Automatic wrappers Fortran -> Python (#133) · Issues · lfortran / lfortran · GitLab
Pywrap basic functionality (!1307) · Merge requests · lfortran / lfortran · GitLab

Once LFortran matures and can compile most Fortran projects, the workflow that I would like, as a user:

Use LFortran to compile my code (using the LLVM backend for example), ensure that everything works
Switch to the pywrap backend and generate Python wrappers. At that point we know LFortran fully understands the code, and thus any remaining bugs must be in the pywrap backend.
I would like LFortran to wrap any modern Fortran code features, such as classes, derived types, coarrays, etc.

I think there could be a huge opportunity to collaborate with the f2py effort on this. If the f2py maintainers are interested in collaborating on this, I am all for it.

zaikunzhang · September 15, 2021, 2:28am

With my Ph.D. student, I have been working on PDFO (GitHub - pdfo/pdfo: Powell's Derivative-Free Optimization solvers.), which provides Python/MATLAB interfaces for the optimization solvers developed by the late Professor M. J. D. Powell FRS (A Memorial Page Dedicated to Professor Powell). The current version (v1.1) uses f2py for the Python interface, and it works great. Thanks to the f2py developers for this wonderful tool!

In addition, I am working to re-implement Powell’s code in modern Fortran, aiming to comply with Fortran standards 2003+. Powell wrote his code in Fortran 77 with a unique style characterized by numerous GOTOs. The new version will be modularized, GOTO-free, and almost loop-free by exploiting the matrix-vector operations supported by modern Fortran. Since the project is still ongoing (and it takes quite some time), I cannot reveal the code for the moment, but I will make an announcement here immediately after the first version is ready. Indeed, all the discussions I initiated here originate from this project.

Beliavsky · September 15, 2021, 3:11am

There is PowellOpt of @jacobwilliams which works fine with gfortran and Intel Fortran. It still has many GOTOs.

zaikunzhang · September 15, 2021, 3:18am

Yes, I am well aware of the wonderful project by @jacobwilliams.

Indeed, I was close to Professor Powell when he was still with us. He was the Ph.D. supervisor of my Ph.D. supervisor. My research is highly influenced by his. During his final days, he asked me to maintain his code when he was no longer there. So my project is a personal promise and commitment I made to Professor Powell.

I sometimes compare my project with the job of translating/interpreting/annotating a classical work like Euclid’s Elements. Doing this, I wish to keep Powell’s solvers understandable, accessible, maintainable, and extendable to everyone, not only the experts; or, simply, to keep them alive. Very few people remember who translated Elements, but it is a job that must be done. I see it as my mission since Professor Powell asked me to maintain his code in his last days and I promised him that I would commit myself to the maintenance. In addition, I know Powell’s algorithms and code very well, since they belong to a research area that I have been working on since my Ph.D. days.

certik · September 15, 2021, 4:31am

There are (will be) a lot of projects like that. If you enjoy that kind of work, you might also enjoy helping us maintain GitHub - fortran-lang/fftpack: Double precision version of fftpack. We’ll add more libraries like that as we go.

You have expressed quite nicely how I feel about our goal for those projects. I simply want it to be maintained and modernized to compile without warnings etc., but not changed. Similarly, if you wanted to actually change Elements, then it should be called something different.

(Regarding fftpack, I also want to have a separate project with some more modern / faster implementation of fft. That way we have the “classical” library (fftpack) and a new library that is possibly better, as two separate packages.)

zaikunzhang · September 15, 2021, 4:50am

Hi, @certik ! Thank you very much for your comments.

We have exactly the same goal! It is great that we see things in the same way.

For your reference, I would like to quote my reply to an issue raised by a user of PDFO, in which he requested a modification to Powell’s code. I am more than sure that the modification will be beneficial (again, it is my research area), but I rejected his request for the moment and promise to do it when I finish the modernization.

You may see the discussions here: COBYLA: increase RHO · Issue #4 · pdfo/pdfo · GitHub. I hope you also allow me to copy-paste them below.

Hi,

We need to improve the performance of COBYLA when the TR was reduced too much. This kind of improvement has been added in the NLOPT distribution of COBYLA.

I have made a dirty modification here GitHub - FrancoisGallard/pdfo at ft_increase_tr
cobylb.f, Line 764:
C Increase TR
      IF (TRURED > 0.0D0 .AND. TRURED >= 0.9D0*PREREM) THEN
            RHO=2*RHO
      END IF
This seems to work in our environment and provides big gains in some situations (4x faster on our test case).
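For readers who do not read fixed-form Fortran, the rule in the snippet above can be sketched in Python as follows. The function name and the keyword parameters are hypothetical, and reading TRURED as the actual reduction and PREREM as the predicted reduction is an assumption based on the variable names, not something stated in the issue.

```python
def maybe_increase_radius(rho, trured, prerem, threshold=0.9, factor=2.0):
    """Return the trust-region radius, doubled when the step did well.

    rho    -- current radius (RHO in the Fortran snippet)
    trured -- assumed: actual reduction achieved by the step (TRURED)
    prerem -- assumed: reduction predicted by the model (PREREM)
    """
    # Same condition as the Fortran IF: a positive reduction that reaches
    # at least 90% of the prediction triggers the increase.
    if trured > 0.0 and trured >= threshold * prerem:
        return factor * rho
    return rho

grown = maybe_increase_radius(0.5, trured=0.95, prerem=1.0)
unchanged = maybe_increase_radius(0.5, trured=0.5, prerem=1.0)
```

Because the radius also drives termination in this kind of solver, a rule like this interacts with the stopping test, which is the concern raised in the next paragraph.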

However we have our own stopping criteria (xtol, ftol) outside COBYLA and I think this modification breaks the stopping criteria in COBYLA.
Maybe it would be required to add such criteria in this implementation of COBYLA in case this option is active.

What do you think?

Best,
François
Salut François,

Thank you very much for your comment. I agree that increasing the trust-region radius is a good idea. Indeed, Powell introduced a strategy to increase the radius in all the other derivative-free solvers of his (UOBYQA, NEWUOA, BOBYQA, and LINCOA). I believe that he would have done the same in COBYLA if he had implemented it later.

I am very happy to know that the simple change you made can lead to significant gains. However, on our side, we will keep Powell’s code “as is” (for the moment).

Indeed, we are working on a re-implementation of Powell’s solvers in modern languages: first Fortran 2003, and then MATLAB, Python, and others. It will be a totally faithful re-implementation. The new code will be equivalent to Powell’s, not only mathematically but also numerically. In other words, the new code will not change Powell’s algorithms at all, and it will produce exactly the same results as Powell’s. The focus is to implement Powell’s solvers in a modular and structured way so that they are readable, maintainable, and extendable. The new code will have no GOTO (of course) and will use matrix-vector procedures instead of loops whenever possible.

Before finishing the re-implementation, we refrain from making any change to Powell’s code on our side. Due to the unique coding style, Powell’s code has extremely high complexity; I would say that nobody can see the true complexity before trying to disentangle the GOTOs, which I am currently working on. Consequently, changing one place will inevitably affect many other places (e.g., the stopping criteria you mentioned), sometimes in a way that is hard to imagine, as I am encountering every day during the re-implementation. Making changes to such code is like introducing new components in a machine that already has excessively many interconnected wires, buttons, knobs, and switches. It may do some good, but I guess it is better to postpone the changes until the machine is modularized and the structure becomes clearer.

The re-implementation is not trivial due to the complexities mentioned before. Recall that we want to have a re-implementation that is totally faithful. Hence intensive tests are needed after writing almost every line of the code, which is quite time-consuming. I have finished the Fortran 2003 version of NEWUOA, and am currently working on COBYLA. It has taken me two summers. I hope to finish all the solvers before the end of next summer. I will let you know when it is ready.

Many thanks for your interest and attention. We hope our work is helpful to you and to others.

Best regards,
Zaikun

certik · September 15, 2021, 5:08am

Indeed. There is a lot of value in doing exactly that. “Classical code” is trusted over the years, a lot of people have a lot of experience with it, it is frequently used in comparisons and referenced or mentioned in new codes etc. Elements is a great example of that. It is strictly speaking “not needed” in modern mathematics as more modern and simpler proofs are now available, but at the same time there is nothing wrong with it. It’s a masterpiece. And maintaining such masterpieces is our duty. Assuming we can do it without stopping progress on modern codes.

And as you encountered in PDFO, it’s a question of “art” and “good taste” as well as “engineering” in what changes should be made and what not. In your case, the change seems simple, so after you are done with refactoring, you might decide to treat it as a “bug fix”. But if it was an extensive change, then you might decide to never do it, and rather start a separate library for that. Sort of like repairing a 600-year-old bridge: you want to use modern engineering, but keep the old look and feel. And if you want a modern bridge (that can handle more load, is wider, taller, or possibly can last longer), you can build a new one elsewhere.

zaikunzhang · September 15, 2021, 5:20am

Thank you very much for your efforts to modernize these fundamental packages. As a computational/applied mathematician, I understand very well the importance/impact of these packages.

However, to be totally honest, I would not say that I enjoy this kind of work very much (sorry). Sometimes, we have to do things that are hard and not enjoyable. Indeed, coding is not a job that is much appreciated in my community, even though it is much more important than most researchers can imagine! Like it or not, the mathematical community appreciates theories more than code; this is the case even for computational/applied mathematics. Anyway, as a junior university professor, I am supposed to do research, i.e., invent new methods/theories rather than polish existing ones. Without doing that, I will literally lose my job, which I rely on to support my family; even worse, I would lose my reputation in my research community if my colleagues do not see me producing new stuff. In addition, as a mathematician, I do enjoy theory much more than coding, even though I fully appreciate the importance of the latter.

No matter how much time/effort it takes, I will endeavor to modernize Powell’s code, because he was my “Grandpa” and I promised him before he left. For other code, I sincerely apologize for not being able to spend time on it, because my time and capability are limited and I have missed too much of my research and life while dealing with old-style Fortran code.

zaikunzhang · September 15, 2021, 5:37am

Totally agree. Bugs must be fixed. For example, we have fixed some infinite loops and segmentation faults in Powell’s code due to NaN (when NaN occurs, comparisons always return .false., and hence some conditions may never be met numerically). Small and straightforward improvements may be included provided that they do not change the algorithm. Significant changes should only be made under a new name.
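The NaN failure mode mentioned above is easy to reproduce in any IEEE-754 language. The sketch below uses Python rather than Fortran, and the shrink_until_converged helper is hypothetical; it only shows that ordered comparisons with NaN are false, so a convergence test like err < tol can never succeed without an explicit guard.

```python
import math

nan = float("nan")
tol = 1e-6

# Ordered comparisons with NaN are False, so neither branch of a test fires.
below = nan < tol
not_below = nan >= tol

def shrink_until_converged(err, tol=1e-6, max_iter=100):
    """Halve err until it drops below tol; stop early on NaN or iteration cap."""
    for k in range(max_iter):
        if math.isnan(err):
            # Explicit guard: without it, 'err < tol' would never become True.
            return k, "nan"
        if err < tol:
            return k, "converged"
        err *= 0.5
    return max_iter, "max_iter"

ok = shrink_until_converged(1.0)
bad = shrink_until_converged(nan)
```

In code without an iteration cap or a NaN check, the second call would loop forever, which matches the kind of infinite loop described above.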

I agree with your point. On the other hand, the situation for Powell’s code might be a bit different since it is not that old yet. The solvers are still (close to) the state of the art. They are widely used by practitioners in various areas, and they are still frequently studied by the research community, either as benchmarks or as the research topic itself.
