Optimization Without Using Derivatives: the PRIMA Package, its Fortran Implementation, and Its Inclusion in SciPy

[Update: PRIMA is hiring!]

As mentioned two weeks ago under the thread of LFortran, I have been developing a package named PRIMA for solving general nonlinear optimization problems without using derivatives. @certik suggested that I should write about PRIMA, especially its inclusion in SciPy.

What is PRIMA?

PRIMA is a package for solving general nonlinear optimization problems without using derivatives. It provides the reference implementation of Powell’s renowned derivative-free optimization methods, i.e., COBYLA, UOBYQA, NEWUOA, BOBYQA, and LINCOA. The “P” in the name stands for Powell, and “RIMA” is an acronym for “Reference Implementation with Modernization and Amelioration”.

PRIMA is a project that I have been working intensively on for the past three years. Almost all questions I posted on this forum come from PRIMA.

Who was Powell?

Michael James David Powell FRS was “a British numerical analyst who was among the pioneers
of computational mathematics”
. He was the inventor/early contributor of quasi-Newton method, trust region method, augmented Lagrangian method, and SQP method. Each of them is a pillar of modern numerical optimization. He also made significant contributions to approximation theory and methods, but I hesitate to mention more details as it is not my area.

Among numerous honors, Powell was one of the first two recipients of the Dantzig Prize from the Mathematical Programming Society/Society for Industrial and Applied Mathematics (SIAM). This is considered the highest award in optimization.

Why PRIMA?

Professor Powell carefully implemented his derivative-free optimization methods into publicly available solvers. They are widely used by engineers and scientists. For instance, see Section 1 of a recent paper on Powell’s solvers as well as the Google searches of COBYLA and BOBYQA.

However, Professor Powell’s implementation was done in Fortran 77. The code is nontrivial to understand or maintain, let alone extend. For many practitioners, this has become an obstacle to exploiting these solvers in their applications. More seriously, bugs keep emerging, but few of them get really fixed — it is highly challenging and indeed frustrating (if not depressing) to debug a maze of 244 GOTOs in 7939 lines of classic-style Fortran 77 code, especially if you do not know the sophisticated algorithms behind the code, which are nontrivial to understand by themselves.

Before he passed, Professor Powell had asked me and Professor Nick Gould to maintain his solvers. This is an honorable mission. To make the solvers more accessible, I started PRIMA. It is a project somehow similar to the translation, interpretation, and annotation of Euclid’s Elements. It will make Powell’s solvers easily understandable to everyone, not only the experts. Few people remember who translated Elements, but it is a job that must be done.

Objectives

PRIMA aims to provide the reference implementation of Powell’s methods in modern languages, first in modern Fortran (F2008 or newer), and then in MATLAB, Python, C++, Julia, and R. It will be a faithful implementation, in the sense that the code will be mathematically equivalent to Powell’s, except for the bug fixes and improvements made intentionally.

The focus is to implement these methods in a structured and modularized way so that they are understandable, maintainable, extendable, fault tolerant, and future proof. The code will have no GOTO (of course) and will use matrix-vector procedures instead of loops whenever possible. In doing so, PRIMA codes the algorithms in a way that we would present them on a blackboard.

(N.B.: There are plenty of discussions on whether we should use matrix-vector procedures or loops in Fortran. Here, the former is the correct one by all means, as long as we note one fact — in derivative-free optimization, the dominating cost comes from the function evaluations implemented by the users rather than the numerical linear algebra of the solver. Each function evaluation takes minutes or months to finish.)

Current status

Modern Fortran

After almost three years of intensive coding, the modern Fortran version of PRIMA has been finished by December 2022.

MATLAB

Python

  • The inclusion of PRIMA into SciPy is under discussion. It will replace the buggy and unmaintained Fortran 77 version of COBYLA underlying scipy.optimize.minimize, and make the other four solvers available to all SciPy users.
  • My ultimate objective is to have a native Python implementation of PRIMA independent of Fortran, similar to what we have done with NEWUOA in MATLAB as mentioned above.

Other languages

  • Interfaces for using the modern Fortran implementation in other languages will be available later.
  • Given the modern Fortran version, native implementations in other languages become much easier, because we now have a structured and modularized implementation as a reference. My team will implement the methods in other languages in this way. I am exploring how LLMs like ChatGPT can accelerate our progress in this process.

What is needed now?

Fortran

  • Make PRIMA available with FPM.
  • Generating documentation using FORD or other tools.

Python

How to include PRIMA in SciPy as soon as possible? This is the question. The major Scipy maintainers are positive about the inclusion of PRIMA solvers in SciPy. See the discussions on GitHub for details. However, I do not know Python at all. So community efforts are greatly needed here.

The first step should be replacing the Fortran 77 implementation of COBYLA in SciPy with the PRIMA version, respecting the current Python signature of COBYLA. The second step will be getting BOBYQA into SciPy, as this solver has been long requested by SciPy users. The third step is to get the other three solvers included.

Previously, my former student Tom @ragonneau and I developed Python interfaces for the Fortran 77 implementation of Powell’s solvers under the PDFO project. They may provide a good starting point to begin with.



I do not think there is any modern Fortran code with modules in the SciPy code base, or even any .f90 code.

References

[1] M. J. D. Powell, A direct search optimization method that models the objective and constraint functions by linear interpolation, In Advances in Optimization and Numerical Analysis, eds. S. Gomez and J. P. Hennart, pages 51–67, Springer Verlag, Dordrecht, Netherlands, 1994

[2] M. J. D. Powell, UOBYQA: unconstrained optimization by quadratic approximation, Math. Program., 92(B):555–582, 2002

[3] M. J. D. Powell, The NEWUOA software for unconstrained optimization without derivatives, In Large-Scale Nonlinear Optimization, eds. G. Di Pillo and M. Roma, pages 255–297, Springer, New York, US, 2006

[4] M. J. D. Powell, The BOBYQA algorithm for bound constrained optimization without derivatives, Technical Report DAMTP 2009/NA06, Department of Applied Mathematics and Theoretical Physics, Cambridge University, Cambridge, UK, 2009

[5] T. M. Ragonneau and Z. Zhang, PDFO: a cross-platform package for Powell’s derivative-free optimization solvers, arXiv:2302.13246, 2023

Remarks

  • LINCOA seeks the least value of a nonlinear function subject to linear inequality constraints without using derivatives of the objective function. Powell did not publish a paper to introduce the algorithm.
  • The paper [5] introduces the PDFO package rather than PRIMA. Nevertheless, it provides a good introduction to Powell’s methods.

Update: PRIMA is hiring!

In addition to the community efforts, I would like to recruit full-time postdocs and/or research assistants to work on PRIMA. He / she must come to my university in Hong Kong to work. Possible tasks include the following.

  1. Make PRIMA available as an FPM package. Automatic documentation of PRIMA with FORD or Sphinx.
  2. Developing Python interfaces / wrappers for the modern Fortran implementation of PRIMA, towards the inclusion of PRIMA into SciPy.
  3. Developing native implementations of PRIMA with the modern Fortran version being the reference. Interesting languages including Python, MATLAB, C++, Julia, and R.

For 2, ideally, the implementation can be obtained (semi-)automatically using scripting languages and maybe LLMs.

The postdoc must have a Ph.D. degree in computational/applied math, computer science, or related domains. The research assistant must have a Master’s degree similarly.

If you, your friend, or your student are interested, please contact me.

20 Likes

I wrote several Python bindings for Fortran projects, you can checkout e.g. how I created the Python bindings for Minpack:

Happy to provide guidance on this topic.

3 Likes

Good job.

Are you planning to include Powell’s “Minimize a sum of squares, derivatives not needed” routine that he developed in the 1970’s (available from the Harwell Subroutine Library HSL, routine VA05). I (and I think many others) have used it for many years and found it useful and reliable. It is unconstrained.

1 Like

@zaikunzhang thank you for this very important effort! Your post made it to the front page of Hacker News: Optimization Without Derivatives: Prima Fortran Version and Inclusion in SciPy | Hacker News.

5 Likes

Hi @awvwgk ! Thank you very much for offering to help.

If I understand things correctly, four files are needed to wrap a Fortran subroutine so that it becomes callable in Python (e.g., cobyla.f90), right?

Given my limited memory of C and zero knowledge of Python, I do not think my implementation of the wrapper would be ideal, even with your kind guidance.

I have some funding available if someone here would like to undertake the project of getting PRIMA into SciPy, although the person must come to work at my university (the Hong Kong Polytechnic University), as the funding cannot be used online/abroad. I can open a research assistant or postdoc position for the project.

Thank you @davidk for your comments.

I checked HAL VA05 and its license. It seems not in the public domain. Therefore, I do not think PRIMA will include it. The situation is similar for VA24, Powell’s famous conjugate-direction method. It is also derivative-free, but it is not in the public domain either.

Thank you @certik for your encouragement!

I cherish the following comment from you. You posted it both on Hacker News and here under the LFortran thread.

I believe that you and the LFortran project are truly turning the tide on Fortran. I wish to see the production release of LFortran as soon as possible. (I have edited my post to include a link to the homepage of LFortran)

If PRIMA can also contribute to “turning the tide on Fortran”, I will feel honored. All the pain, depression, and frustration that I have experienced in the past three years will then be partially compensated.

Let us the community work together to get PRIMA included in SciPy and see whether the tide on Fortran will improve.

Many thanks.

1 Like

Hi @zaikunzhang awesome initiative! Here my grain of salt:
Regarding the approches to interface fortran to python I would say you have 3 main alternatives:

  • f2py … I would personally discourage it at any cost, it might seem tempting but it will put at lot of constraints on your library
  • using the CFFI interface as @awvwgk suggested
  • using the ctypes library

The last two are rather similar. They offer a bridge between data types. Using the last option you would need the following:

  • on your Fortran library you could add a module file containing just interfaces which use iso_c_binding to declare the input/output variables just as in the example from @awvwgk
  • a build system strategy to create a shared library (dll/so). You could do this typically with CMake
  • a python project wrapper which calls the functions available in the previously compiled shared library: for this step you can use numpy.ctypeslib.load_library

With this strategy you wont need the header file and you can decouple your Fortran code from the python package. This can be seen as an advantage or disadvantage depending on your project’s needs.

1 Like

@zaikunzhang I wrote a Python wrapper for a gradient-based optimization library written in Fortran using the last alternative (ctypes). You can take a look at GitHub - ofmla/seiscope_opt_toolbox_w_ctypes: This repo shows how to use SEISCOPE optimization toolbox Fortran code from Python The repo has some basic unit tests that check the results from optimization subroutines called from Python are equal to those obtained by demo Fortran programs.

3 Likes

I used ctypes al lot to call Fortran or C procedure from Python, it is quite simple.
What would be very nice, but more complex, I’m afraid, is to have the same behaviour of the scipy function quad. One have to ask guidance to people working in scipy.

I mean being able to pass a callback that has been compiled with Cython or numba and that is running without going back to Python. See:
https://numba.readthedocs.io/en/stable/user/cfunc.html

That would be very valuable as numpy could be very fast with linear algebra but then everything will slow down once one has a problem involving callbacks like ODE, optimization, etc. At least, this is my impression.
Whenever I need something like that I typically switch to implement it in Fortran and generally I can achieve a speedup of about one hundred (it depends on the problem of course).

1 Like

Dear @zaikunzhang, I have followed your work on PRIMA for a while and am very impressed by what you have achieved! I’m glad to see that the discussions about re-licensing were successful!

I am the maintainer of the estimagic package for nonlinear optimization. In estimagic we wrap algorithms from scipy, nlopt and others and add a few convenience features.

I have little experience with Fortran but a lot of experience in Python, particularly when it comes to numerical optimizers. If you think, this could be helpful I would be happy to discuss how I can help with the inclusion of Prima in scipy.

My feeling is that it would be best to discuss with the scipy maintainers which approach they prefer for the Python bindings (f2py, CFFI or ctypes) and hopefully they could point us to an example of how they compile Fortran code inside their build process. Then we could work together on everything that has to be done in Fortran (e.g. writing the module file @hkvzjal mentioned) and I (and possibly some others from the estimagic team) could work on the Python side.

(To clarify, we do not need any funding and would be happy to do this in our free time)

1 Like

@zaikunzhang , I just joined this forum because I happened to read this thread. I may be able to help you in implementing pure Python version of PRIMA. In my undergrad days, I used FORTRAN77 a lot. Today, I use Python in my profession (optimization algorithms). I don’t know modern Fortran but I guess I can catch up.:blush:

1 Like

Thank you @hkvzjal for the detailed explanation. All these sound interesting and reasonable to me. I suppose understanding the details would necessitate trying and getting hands dirty. I will keep them in mind.

Hi @ofmla , thank you for sharing. I had a look at ofmla/seiscope_opt_toolbox_w_ctypes. It is very impressive. The solvers included are very classical and powerful. The package will be a good reference for developing Python interfaces of PRIMA using ctypes. Many thanks!

1 Like

Hi @janosg ! Welcome to join the Fortran community!

Thank you very much for offering to help. Sorry for the delayed reply.

I have been following your project OpenSourceEconomics/estimagic as well. It is quite impressive.

I fully agree. Whatever we do for getting PRIMA into SciPy will be subject to review and approval from the SciPy community. It is better to collect their opinions before starting. I will make a post under the GitHub issue of SciPy. Feel free to join the discussion.

Thank you very much for the voluntary help, which is highly appreciated. Let’s see how to organize the collective efforts of the community, including the offers from @Ito and some other friends on the discourse. Many thanks to all of you!

1 Like

Welcome @ito ! Thank you very much for offering to help!

I hope the modern Fortran code of PRIMA is understandable even for those who are not familiar with modern Fortran. The syntax and style are extremely close to MATLAB. Indeed, I purposely refrained from using Fortran-specific constructs during the development, so that the code is easy to be translated into other languages. At least, it is easy if the target language is MATLAB.

I was thinking about developing some tool to translate the code automatically, or semi-automatically with minimum postprocessing that will be done manually. This is indeed how we obtained the MATLAB implementation of NEWUOA, one of the five solvers in PRIMA. Most of the MATLAB code was generated by some Ocaml scripts. The code was then post-processed using some bash script and by hand. Do you think this is possible for Python?

Many thanks again!

The latest update from the SciPy community:

Update from the community call: there is interest in adopting this implementation, but little appetite for taking on more Fortran code (because we’re already working to get rid of it in several other components). A feasible route forward could be to translate the Fortran implementation to C, using e.g. f2c, and to use that. Or perhaps there is interest in doing an official Cython / Pythran / C / C++ port?

I have opened a new thread about this because the viewpoint and trend reflected by this message should receive more attention from the Fortran community in general.

A comment on reddit:

Finally, a fortran package not written in Fortran77 but one trying to be approachable and maintainable rather than the product of unhappy PhD students spending many months to eke out a little more in an incomprehensible, Oracle-comparable mess.

Correct or not, this reflects to some extent the general image of Fortran in the programming community. If you do not agree with this image, you may post under the reddit thread.

With pleasure!

If you want to start getting your hands dirty you could start with a very simple example. I reworked a minimum working example previously discussed here

Since I work on windows but also have linux through WSL I prepared it such that you can test the following combination A) windows with ifort B) Linux with gfortran. Also, let’s assume that for simplicity you have everything in the same folder:

fortran2python.f90
module fortran2python
  use iso_c_binding
  implicit none
  contains
    subroutine say_hello(X,dim,num_points) bind(c)
#if __INTEL_COMPILER
      !DEC$ ATTRIBUTES DLLEXPORT :: say_hello
#elif __GNUC__
      !GCC$ ATTRIBUTES DLLEXPORT :: say_hello
#endif      
    ! Variables
      integer(c_int), intent(in) :: num_points
      integer(c_int), intent(in) :: dim 
      real(c_double), intent(inout) :: X(dim,num_points) !! array of coordinates
      
      integer :: i
      !-------------------------------
      print *, 'hello from fortran'
      
      print *, 'Whole array: ',X(:,:)
      
      do i = 1, num_points
          print *, 'by data point', x(:,i)
      end do
      
      x(1,2) = 2*x(1,2)
      
  end subroutine
  
  end module

test.py
import os
import numpy as np
import ctypes

folderpath = '' 
if os.name == 'nt': #windows
    DLLname = 'fortran2python.dll'
elif os.name == 'posix': #Linux/Mac
    DLLname = 'libfortran2python.so'
api = np.ctypeslib.load_library(DLLname,folderpath)

X = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
])

print('X shape:',X.shape)
print(X[0,0],X[0,1])
print(X[1,0],X[1,1])
print(X[2,0],X[2,1])

api.say_hello(  ctypes.byref(np.ctypeslib.as_ctypes( X )) ,
                ctypes.byref(ctypes.c_int(X.shape[1])) ,
                ctypes.byref(ctypes.c_int(X.shape[0])) 
             )

print('back in python')
print(X[0,0],X[0,1])
print(X[1,0],X[1,1])
print(X[2,0],X[2,1])

print('hello')

With those two files you can now compile the dynamic link library and run the test from python as follows

A) Windows + ifort

ifort -fpp -O3 -c fortran2python.f90
ifort -dll -exe:fortran2python.dll fortran2python.obj
python test.py

B) Linux + gfortran

gfortran -cpp -O3 -fpic -c fortran2python.f90
gfortran -shared -o libfortran2python.so fortran2python.o
python test.py

You can learn more here: Managing libraries (static and dynamic libraries) — Fortran Programming Language

few details to notice:

  • Had to use the -fpp/-cpp pre-processors because of these key lines:
    subroutine say_hello(X,dim,num_points) bind(c)
#if __INTEL_COMPILER
      !DEC$ ATTRIBUTES DLLEXPORT :: say_hello
#elif __GNUC__
      !GCC$ ATTRIBUTES DLLEXPORT :: say_hello
#endif      

intel and gnu compilers use different decorators, so in order to be cross-compiler compatible you will have to use this in your interface functions.

As mentioned in the page I linked before, the ‘-fPIC’ flag is necessary with GNU compiler in order to enable Position Independent Code, which is needed make the collection of functions in the library callable in any order.

Once you get comfortable with this interfacing with a very simple example you’ll have to put in the mix the “build system” part, you can go for CMake or meson (I’m most familiar with the first but the latter seems to be even easier)… As was said previously the SciPy team can help with that decision. Anyhow, at the end your work will be to provide the binary compiled library that would be as least intrusive as possible in their architecture. Since calling N dynamic (or static) libraries is basically how most applications work now a days it wont (shouldn’t) be a problem.

2 Likes

That looks suspicious. Should that be #elif __GNUC__?

1 Like