@jacobwilliams and others,
nlopt looks like a great optimization tool, but its fortran interface currently is Fortran 77. Could be updated to use
@jacobwilliams and others,
Yeah, some of that is Fortran 77 code that was converted to C and then a Fortran 77 wrapper written for it. If a Fortran version of something exists, the right way is to modernize it and use it. No need for C to be involved. Improvements can be made to the modernized Fortran code.
I’m also not a fan of writing Fortran 90 interfaces to a pile of Fortran 77 subroutines, as though they are our ancient legacy that must never be changed. But maybe I’ve mentioned that before.
You might be interested in checking out the great work done by Sebastian @awvwgk
The native NLopt Fortran bindings are really only the bare minimum, i.e. they contain only the enumerator values, not even explicit interfaces, which makes them rather fragile to use.
As Emanuele @epagone pointed out, I already went through all the work to create interfaces to the actual C API of NLopt and wrap them nicely in a derived type to make the overall user experience more pleasant (announced in NLopt Fortran bindings). I also put some work into making the Fortran bindings compatible with all NLopt versions since 2.0.0, because not all package managers ship the latest version. Feedback and contributions are of course very welcome.
If you want to see an actual application of the Fortran bindings to NLopt, I recently pushed dftd4/dftd4-fit, which glues together a couple of fpm projects to a small command line optimization driver.
Not to undermine Sebastian’s large effort, however, I would like to mention that I also attempted to create a modern Fortran interface back in 2019. I even tried to submit it as pull request ([WIP] A Modern Fortran Interface by ivan-pi · Pull Request #233 · stevengj/nlopt · GitHub) to the original NLopt repository. Preceding my own attempt, GitHub member aitap described how to wrap NLopt in a blog post (Wrapping C opaque pointers in Fortran 2003 in type-strict way).
The factors blocking my PR at the time were CMake integration in the upstream project and also some struggles with memory leaks and interfaces in the object-oriented Fortran wrapper (Actual and dummy procedure argument mismatch - Error 7062 - Intel Communities).
Since some NLopt users already depend upon the “old” (standard-violating…) interface, I think it would be difficult to have a modern interface co-exist in the original NLopt development repository. I also had the feeling the maintainers (and creators) of NLopt aren’t particularly interested in modern Fortran as a contemporary programming language.
Hence, I ultimately closed my PR and switched to using @awvwgk’s wrappers instead of my own (incomplete) version. AFAICT the
nlopt-f interface is completely type-safe and agnostic of the fact the underlying library is in C. It only assumes that you have
nlopt installed somewhere.
My takeaway from this whole experience is that fragmentation of the community can really hinder progress. Another lesson I learned is the importance of “finishing” a project to a level where others can use it safely.
I would add a note for future Fortran optimization libraries (cc @zaikunzhang), if they could provide a
nlopt-f callback compatible interface, it would make switching between optimization libraries much easier for consumers. The current interface expected in
function nlopt_func_interface(x, gradient, func_data) result(f) import :: c_int, c_double, c_ptr implicit none real(c_double), intent(in) :: x(:) real(c_double), intent(inout), optional :: gradient(:) class(*), intent(in), optional :: func_data real(c_double) :: f end function
If you already have an optimization library with different callback conventions for the objective function and parameters, you can employ the adaptor pattern to offer consumers a type-safe interface. The same pattern is in fact used in
nlopt-f to interface with the original library in C.
For even wider consumption potentially through NLopt itself (if the maintainers agree upon it), your optimization algorithm (as a Fortran subprogram) would need to export an interface similar to:
nlopt_result cobyla_minimize(unsigned n, nlopt_func f, void *f_data, unsigned m, nlopt_constraint *fc, unsigned p, nlopt_constraint *h, const double *lb, const double *ub, /* bounds */ double *x, /* in: initial guess, out: minimizer */ double *minf, nlopt_stopping *stop, const double *dx);
This however means you agree to implement all the possible stopping criteria expected by the NLopt API (see nlopt_stopping or nlopt_result), as well as some further conditions upon the constraints (e.g. NLopt expects the objective function will never be evaluated outside of the bounds) among other things.
Good point, I opened an issue for discussing the ad hoc choice of callback in
nlopt-f: Callback function design · Issue #16 · grimme-lab/nlopt-f · GitHub. If we can agree on a community standard for callbacks in optimization libraries, I would be happy to support it in
nlopt-f as well.
At least in the fpm ecosystem, a way of formalizing such community standards for callbacks would be to offer them in a module of abstract interfaces.
(As a side question, when did Fortran introduce abstract interfaces?)
abstract type SciMLProblem end
All types of problems, differential equations, PDE’s, linear systems, nonlinear systems, etc. are all sub-types of this. It is really impressive how they connected all the pieces with a huge hierarchy. For each abstract sub-problem, there will be a method looking something like
function solve(prob::AbstractProblem,args...,kwargs...) # . end
Using multiple-dispatch, the method will specialize based upon the problem, algorithms, parameters, and so forth.
Here is a concrete look of how it ends up on the consumer level:
using DifferentialEquations f(u,p,t) = 1.01*u u0 = 1/2 tspan = (0.0,1.0) prob = ODEProblem(f,u0,tspan) sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8)
For an optimization problem, you again have the same kind of interface:
using GalacticOptim, Optim rosenbrock(x,p) = (p - x)^2 + p * (x - x^2)^2 x0 = zeros(2) p = [1.0,100.0] prob = OptimizationProblem(rosenbrock,x0,p) sol = solve(prob,NelderMead())
The pattern works very well for them. It requires a shift in mindset from the classic Fortran interfaces with a subprogram (algorithm) that expects a callback function. In the Julia pattern you encapsulate both the problem and the algorithm. With some diligence and collaboration, a similar feat could be done in Fortran, building upon the same patterns exploited by Julia programmers.
Fortran 2003. The IBM Fortran site is one that marks such things clearly.
Thank you @ivanpribec for the kind reminder and informative elaboration. I fully agree that it is important for libraries to provide compatible interfaces so that users can easily switch from one library to another. This is particularly relevant for optimization due to the inevitability of passing (scalar- or vector-valued) functions.
For the moment, my development is strictly focused on the modernization of Powell’s Derivative-free Optimization solvers, which were originally coded in Fortran 77 by M. J. D. Powell FRS (Wikipedia). These solvers are also included in NLopt, an example being the
cobyla solver mentioned in your post. Currently, I concentrate more on the code (that does the real computation) instead of the interfaces. The current interfaces follow essentially the design by Powell with modernization.
For example, the interface I adopt for
cobyla at this stage is as follows. I will investigate more carefully the interfaces of existing libraries (particularly
stdlib) when I am about to finish the modernization of the code.
Click to see the interface
subroutine cobyla(calcfc, m, x, f, & & cstrv, constr, & & f0, constr0, & & nf, rhobeg, rhoend, ftarget, ctol, maxfun, iprint, & & xhist, fhist, chist, conhist, maxhist, info) !--------------------------------------------------------------------------------------------------! ! Among all the arguments, only CALCFC, M, X, and F are obligatory. The others are OPTIONAL and you can ! neglect them unless you are familiar with the algorithm. If you do not specify an optional input, ! it will be assigned the default value detailed below. For instance, we may write ! ! call cobyla(calcfc, m, x, f) ! ! or ! ! call cobyla(calcfc, m, x, f, rhobeg = 0.5D0, rhoend = 1.0D-3, maxfun = 100) ! ! See examples/cobyla_exmp.f90 for a concrete example. ! ! A detailed introduction to the arguments is as follows. ! N.B.: RP and IK are defined in the module CONSTS_MOD. See consts.F90 under the directory name ! "common". By default, RP = kind(0.0D0) and IK = kind(0), with REAL(RP) being the double-precision ! real, and INTEGER(IK) being the default integer. For ADVANCED USERS, RP and IK can be defined by ! setting __REAL_PRECISION__ and __INTEGER_KIND__ in common/ppf.h. Use the default if unsure. ! ! CALCFC ! Input, subroutine. ! CALCFC(X, F, CONSTR) should evaluate the objective function and constraints at the given ! REAL(RP) vector X; it should set the objective function value to the REAL(RP) scalar F and the ! constraint value to the REAL(RP) vector CONSTR. It must be provided by the user, and its ! definition must conform to the following interface: ! !-------------------------------------------------------------------------! ! !subroutine calcfc(x, f, constr) ! !real(RP), intent(in) :: x(:) ! !real(RP), intent(out) :: f ! !real(RP), intent(out) :: constr(:) ! !end subroutine calcfc ! !-------------------------------------------------------------------------! ! Besides, the size of CONSTR must be M, which is the second compulsory argument (see below). ! ! M ! Input, INTEGER(IK) scalar. ! M must be set to the number of constraints, namely the size (length) of CONSTR(X). ! ! X ! Input and output, REAL(RP) vector. ! As an input, X should be an N-dimensional vector that contains the starting point, N being the ! dimension of the problem. As an output, X will be set to an approximate minimizer. ! ! F ! Output, REAL(RP) scalar. ! F will be set to the objective function value of X at exit. ! ! CSTRV ! Output, REAL(RP) scalar. ! CSTRV will be set to the constraint violation of X at exit, i.e., MAXVAL([-CONSTR(X), 0]). ! ! CONSTR ! Output, ALLOCATABLE REAL(RP) vector. ! CONSTR will be set to the constraint value of X at exit. ! ! F0 ! Input, REAL(RP) scalar. ! F0, if present, should be set to the objective function value of the starting X. ! ! CONSTR0 ! Input, REAL(RP) vector. ! CONSTR0, if present, should be set to the constraint value of the starting X; in addition, ! SIZE(CONSTR0) must be M, or the solver will abort. ! ! NF ! Output, INTEGER(IK) scalar. ! NF will be set to the number of calls of CALCFC at exit. ! ! RHOBEG, RHOEND ! Inputs, REAL(RP) scalars, default: RHOBEG = 1, RHOEND = 10^-6. RHOBEG and RHOEND must be set to ! the initial and final values of a trust-region radius, both being positive and RHOEND <= RHOBEG. ! Typically RHOBEG should be about one tenth of the greatest expected change to a variable, and ! RHOEND should indicate the accuracy that is required in the final values of the variables. ! ! FTARGET ! Input, REAL(RP) scalar, default: -Inf. ! FTARGET is the target function value. The algorithm will terminate when a feasible point with a ! function value <= FTARGET is found. ! ! CTOL ! Input, REAL(RP) scalar, default: machine epsilon. ! CTOL is the tolerance of constraint violation. Any X with MAXVAL(-CONSTR(X)) <= CTOL is ! considered as feasible. ! ! MAXFUN ! Input, INTEGER(IK) scalar, default: MAXFUN_DIM_DFT*N with MAXFUN_DIM_DFT defined in the module ! CONSTS_MOD (see common/consts.F90). MAXFUN is the maximal number of calls of CALCFC. ! ! NPT ! Input, INTEGER(IK) scalar, default: 2N + 1. ! NPT is the number of interpolation conditions for each trust region model. Its value must be in ! the interval [N+2, (N+1)(N+2)/2]. ! ! IPRINT ! Input, INTEGER(IK) scalar, default: 0. ! The value of IPRINT should be set to 0, 1, -1, 2, -2, 3, or -3, which controls how much ! information will be printed during the computation: ! 0: there will be no printing; ! 1: a message will be printed to the screen at the return, showing the best vector of variables ! found and its objective function value; ! 2: in addition to 1, each new value of RHO is printed to the screen, with the best vector of ! variables so far and its objective function value; each new value of CPEN is also printed; ! 3: in addition to 2, each function evaluation with its variables will be printed to the screen; ! -1, -2, -3: the same information as 1, 2, 3 will be printed, not to the screen but to a file ! named COBYLA_output.txt; the file will be created if it does not exist; the new output will ! be appended to the end of this file if it already exists. Note that IPRINT = -3 can be costly ! in terms of time and space. ! ! XHIST, FHIST, CHIST, CONHIST, MAXHIST ! XHIST: Output, ALLOCATABLE rank 2 REAL(RP) array; ! FHIST: Output, ALLOCATABLE rank 1 REAL(RP) array; ! CHIST: Output, ALLOCATABLE rank 1 REAL(RP) array; ! CONHIST: Output, ALLOCATABLE rank 2 REAL(RP) array; ! MAXHIST: Input, INTEGER(IK) scalar, default: MAXFUN ! XHIST, if present, will output the history of iterates; FHIST, if present, will output the ! history function values; CHIST, if present, will output the history of constraint violations; ! CONHIST, if present, will output the history of constraint values; MAXHIST should be a ! nonnegative integer, and XHIST/FHIST/CHIST/CONHIST will output only the history of the last ! MAXHIST iterations. Therefore, MAXHIST= 0 means XHIST/FHIST/CONHIST/CHIST will output nothing, ! while setting MAXHIST = MAXFUN requests XHIST/FHIST/CHIST/CONHIST to output all the history. ! If XHIST is present, its size at exit will be (N, min(NF, MAXHIST)); if FHIST is present, its ! size at exit will be min(NF, MAXHIST); if CHIST is present, its size at exit will be ! min(NF, MAXHIST); if CONHIST is present, its size at exit will be (M, min(NF, MAXHIST)). ! ! Important Notice: ! Setting MAXHIST to a large value can be costly in terms of memory for large problems. ! For instance, if N = 1000 and MAXHIST = 100, 000, XHIST will take up to 1 GB if we use double ! precision. MAXHIST will be reset to a smaller value if the memory needed exceeds MAXMEMORY ! defined in CONSTS_MOD (see consts.F90 under the directory named "common"; default: 2GB). ! Use *HIST with caution!!! (N.B.: the algorithm is NOT designed for large problems). ! ! MAXFILT ! Input, INTEGER(IK) scalar. ! MAXFILT is a nonnegative integer indicating the maximal length of the "filter" used for ! selecting the returned solution; default: 2000 (a value lower than 200 is not recommended) ! ! INFO ! Output, INTEGER(IK) scalar. ! INFO is the exit flag. It will be set to one of the following values defined in the module ! INFO_MOD (see common/info.F90): ! SMALL_TR_RADIUS: the lower bound for the trust region radius is reached; ! FTARGET_ACHIEVED: the target function value is reached; ! MAXFUN_REACHED: the objective function has been evaluated MAXFUN times; ! MAXTR_REACHED: the trust region iteration has been performed MAXTR times, ! the value of MAXTR being 4*MAXFUN, which is UNLIKELY to reach; ! NAN_INF_X: NaN or Inf occurs in x; ! NAN_MODEL: NaN occurs in the model; ! DAMAGING_ROUNDING: rounding errors are becoming damaging. ! !--------------------------------------------------------------------------! ! The following cases should NEVER occur unless there is a bug, because the ! code will try to continue in the corresponding scenarios. ! NAN_INF_F: the objective function returns NaN or +Inf ! TRSUBP_FAILED: a trust region step failed to reduce the quadratic model ! !--------------------------------------------------------------------------! !--------------------------------------------------------------------------------------------------! implicit none ! Compulsory arguments procedure(OBJCON) :: calcfc real(RP), intent(inout) :: x(:) real(RP), intent(out) :: f integer(IK), intent(in) :: m ! Optional inputs integer(IK), intent(in), optional :: iprint integer(IK), intent(in), optional :: maxfilt integer(IK), intent(in), optional :: maxfun integer(IK), intent(in), optional :: maxhist real(RP), intent(in), optional :: constr0(:) real(RP), intent(in), optional :: ctol real(RP), intent(in), optional :: f0 real(RP), intent(in), optional :: ftarget real(RP), intent(in), optional :: rhobeg real(RP), intent(in), optional :: rhoend ! Optional outputs integer(IK), intent(out), optional :: info integer(IK), intent(out), optional :: nf real(RP), intent(out), allocatable, optional :: chist(:) real(RP), intent(out), allocatable, optional :: conhist(:, :) real(RP), intent(out), allocatable, optional :: constr(:) real(RP), intent(out), allocatable, optional :: fhist(:) real(RP), intent(out), allocatable, optional :: xhist(:, :) real(RP), intent(out), optional :: cstrv end subroutine cobyla
Regarding this interface, I do have a comment to make, and it may be relevant for
NLopt @awvwgk as well. In Powell’s design, the objective function and constraint function is evaluated by a single subroutine
calcfc. This may seem unnatural or even inconvenient at the first glance. I have never asked Powell why he chose this design, but I believe he had a reason to do so. My speculation is as follows. Powell designed his solvers for problems that cannot provide derivatives, aka derivative-free optimization problems. In such problems, the objective and constraints are commonly evaluated by sophisticated and expensive computation/simulation, often black boxes. It is not abnormal that the objective and constraints are evaluated simultaneously (e.g., during the same simulation) and cannot be decoupled trivially. Therefore, it is reasonable to have a single subroutine
calcfc that calculates both of them. Surely, we may decouple
calcfc into two (e.g.,
calfun for the objective and
calcon for the constraints) by an additional layer of wrapping, but we must avoid re-evaluation of
calcfc at the same point because such evaluations are expensive.
In the future, I will implement an interface that allows passing the objective and constraints by not only
calcfc , but also
[calfun, calcon] or
I’ve been using the NLopt port of cobyla in my own work.
The design where both the objective function and constraints are evaluated together does make sense also for efficiency. Parts of the computations may be reused for both purposes. I’m not familiar enough with optimization to say, whether the constraints and objective function must always be computed simultaneously.
In a similar manner, the NLopt interface already combines the objective function and its gradient. If the algorithm only wants to update the objective function, but reuse the gradient from the previous iteration, it can do so easily, by passing a null pointer for the gradient callback. This does require the user, to check for the null pointer in his callback:
if (present(gradient)) then gradient(1) = 3.0_wp * a * (a*x(1) + b)**2 gradient(2) = -1.0_wp endif f = (a*x(1) + b)**3 - x(2)
In this example a good compiler can reuse the temporary variable
a*x(1) + b. This might not be the case, if the user has to provide two callback functions. For a derivative-free optimizer, you just skip the gradient part in the callback completely.
Thank you @ivanpribec for the explanation. It is very helpful to hear how practitioners use and regard these algorithms.
However, there are some significant misconceptions here. COBYLA does not use derivatives (hence the name “derivative-free optimization”). That’s the whole point of the algorithm.
If you have (approximate) (sub-)derivatives available, it is wrong to use derivative-free algorithms. Otherwise, it is like fighting a tiger by bare hands while you have machine guns with you. As a researcher in this particular area, this is always the first thing I tell to those who want to try derivative-free algorithms. For more information about this (quite special) category of algorithms, I would like to refer to the Introduction section of the PDFO package and a talk I gave some time ago.
Powell’s COBYLA algorithm (I mean, the algorithm in the mathematical sense) uses the objective and constraint values always simultaneously. So there is no efficiency problem to evaluate them always simultaneously.
What you described seems to be a much-adapted version of COBYLA, but again, it is hardly a good idea to use it when you have any kind of (approximate) (sub-)derivatives available. There are so many much more efficient algorithms and packages.
I don’t think there is any misconception. My reply was only concerning the software interface. The example code with the objective function and gradient was just to highlight the mechanism by which gradient information is “requested” from the user. Putting the gradient- and function in the same callback is good because it means the compiler has a chance to optimize sub-expression reuse.
NLopt provides several algorithms for both constrained/unconstrained optimization, and derivative-free/derivative-available. In all cases, the low-level C library expects a function with the prototype:
double f(unsigned n, const double* x, double* grad, void* f_data);
Internally in NLopt, derivative-free algorithms will simply pass
NULL for the
grad array, and reuse the same interface. Users which do not know the gradient, will simply neglect the
grad dummy variable entirely.
I fully agree to decouple the evaluation of objective and constraints as much as possible, provided that we are talking about algorithms aiming at problems that can naturally evaluate the objective&constraints separately and can potentially provide derivatives. Things should not be coupled when they can be naturally decoupled.
Indeed, the current version of PDFO asks for separate evaluations of objective&constraints. We thought it was a convenient design. However, it turns out that Powell’s design is better for many problems that really need Powell’s algorithms, because objective&constraints are often impossible to be evaluated separately in those problems. That is why in future versions we plan to support more flexible ways of passing the objective&constraints.