I have an idea to create a fortranlang python package (name TBD). The idea would be to make high-performance Python wrappers for some Fortran libraries, including MINPACK, ODEPACK, and QUADPACK.
You might be thinking… "scipy already wraps these libraries???". Yes, SciPy does, but the issue with SciPy is that its wrappers are fairly slow and cannot be called from within numba-compiled functions (numba is a Python compiler). It is possible to make much faster Python wrappers using ctypes (although these wrappers are less flexible), and ctypes wrappers can be called from within numba-compiled functions. Therefore, you can get massive speedups over SciPy.
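To sketch the basic idea (this is not numbalsoda's code; libm's `cos` stands in for a Fortran routine compiled to a shared library, and the fallback library name is an assumption for Linux):

```python
import ctypes
import ctypes.util

# Load a shared library and declare one routine's signature.
# find_library may return None on some platforms, hence the fallback.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # 1.0

# numba supports calling ctypes-declared functions in nopython mode,
# so the very same call works inside an @njit-compiled function:
#
#   from numba import njit
#
#   @njit
#   def f(x):
#       return libm.cos(x)
```

Because numba knows how to lower a ctypes function pointer to a direct C call, there is no Python-level overhead per call inside the jitted function, which is where the speedup over SciPy's wrappers comes from.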
For example, I’ve put together the numbalsoda Python package, which wraps the LSODA routine in ODEPACK. Right now the package uses a C++ version of LSODA because ODEPACK isn’t modernized quite yet (see progress by @jacobwilliams here). But I’d like to switch to Fortran once a modernized ODEPACK is available. numbalsoda is up to ~100x faster than SciPy’s wrapper of LSODA (benchmark here).
I think this would (1) be a very useful and popular Python package and (2) raise awareness in the scientific community of how useful Fortran can be.
Wondering people’s thoughts! All brainstorming welcome.
I tend to prefer smaller, well-focused libraries. But perhaps that’s not a major concern in Python? And perhaps the *PACK libraries aren’t all that well focused to begin with? Mostly just food for thought, but I wouldn’t initially jump to combining multiple Fortran libraries into a single Python package.
Eh, are you saying that for a small ODE system, like just 3 ODEs or so, the C version of LSODA is faster than Fortran ODEPACK’s LSODA by 100x?
I mean, if both are compiled with the -O3 -xHost optimization flags, the C version and the Fortran version shouldn’t differ by 100x, LOL.
Or is the 100x difference mainly caused by Python?
In your opinion, in terms of speed, how much speedup can the new modernized ODEPACK expect to get over the current F77 version of ODEPACK?
I’m much in favor of providing language bindings for the libraries we maintain under Fortran-lang. If done right, it is not even noticeable that Fortran is used in a Python script (apart from the speed, of course). I wouldn’t call the Python package fortranlang, however.
For minpack there is an open issue discussing Python bindings.
Usually I put Python bindings in a python/ subdirectory of my projects, set up so that they can either be built directly with the main project or built standalone against an existing installation.
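As a rough sketch of that layout (all names here are hypothetical, not from any specific project):

```
mylib/
├── CMakeLists.txt        # builds the Fortran library; can descend into python/
├── src/
│   └── mylib.f90
└── python/
    ├── setup.py          # standalone build against an installed mylib
    └── mylib/
        └── __init__.py
```

The point of the split is that `python/` can be driven by the top-level build, or handed to pip on its own when the native library is already installed.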
My main issue is with ctypes: the handling is quite low-level and requires a lot of repetition, while you don’t get any useful error checking on the Python side. I usually go for wrapper generators like CFFI or SWIG, or something with more direct access to C like Cython.
The reason I have used ctypes is that it is numba compatible: all the Python code can be compiled and be very speedy. This sets it apart from SciPy. Cython is not numba compatible. CFFI is numba compatible, but I’ve never used it. It looks very similar to ctypes? What is its advantage over ctypes?
CFFI has multiple modes. The ABI modes are similar to ctypes, but a bit more powerful because you have typedefs available, while in ctypes you only get c_void_p for all opaque handles. Also, the dlopen step in ctypes is pretty fragile, because you must know where the library you want to load is located, and the Python module might not be installed in the same prefix as the wrapped library.
I’m mostly using the out-of-line API mode in CFFI, which allows you to generate a fully functional extension module directly from a C header, meaning you don’t have to duplicate the header definitions in the Python source code. With some extra effort you can even link the library you depend on statically into the extension module and be almost dependency-free (useful for producing wheels).
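For comparison, here is a minimal ABI-mode sketch (assuming cffi is installed and a Unix-like system where libm’s symbols are visible to the process):

```python
from cffi import FFI

ffi = FFI()
# Real C declarations: distinct types for handles and arguments,
# rather than ctypes' one-size-fits-all c_void_p.
ffi.cdef("double cos(double x);")

# dlopen(None) resolves against symbols already loaded into the
# process, avoiding a hard-coded library path.
lib = ffi.dlopen(None)
print(lib.cos(0.0))  # 1.0
```

The out-of-line API mode replaces `dlopen` with a compiled extension module generated from the same `cdef` declarations, which is where the static-linking trick for wheels comes in.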
Ya, I think performance gains could be made in the contd8 interpolation routine in dop853. contd8 considers a single variable at a time, but it could interpolate all variables at once. I think this would permit the compiler to make some additional optimizations.
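To illustrate the single-component vs. all-at-once idea (this uses a generic Horner evaluation with made-up coefficients, not dop853’s actual dense-output formula):

```python
import numpy as np

def interp_single(i, theta, con):
    """One component at a time, mimicking contd8's interface.
    con is an (order, n) array of dense-output coefficients."""
    acc = con[-1, i]
    for c in con[-2::-1, i]:
        acc = acc * theta + c
    return acc

def interp_all(theta, con):
    """All n components in one pass over the coefficient array;
    each Horner step now operates on a whole row, which
    vectorizes naturally."""
    acc = con[-1].copy()
    for row in con[-2::-1]:
        acc = acc * theta + row
    return acc

con = np.random.default_rng(0).random((8, 3))
theta = 0.4
assert np.allclose([interp_single(i, theta, con) for i in range(3)],
                   interp_all(theta, con))
```

The same restructuring in Fortran (looping over components in the innermost dimension) is what would give the compiler the chance to vectorize.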
Most of the especially fancy stuff is for large or stiff systems. When you have small nonstiff equations, there’s not much you can do to speed things up (even Python’s JAX can compete for stuff like this). It’s also not really correct that FLINT is in general faster here. It is faster in the case with the specific type of callback used, because DifferentialEquations.jl doesn’t yet have a way to specify a callback that you only want to hit once. For the case without callbacks, the algorithm Julia selects by default (Vern8) is as fast as any of the FLINT solvers.