Great questions. Similar questions have been nagging me for a while now, should we encapsulate the problem, the solution, the solver, and how do we make them composable? There are similar dichotomies in writing object-oriented libraries for linear algebra, and ODE solvers. Do you make the matrix the derived type, or the factorization? Unfortunately, I haven’t yet reached a conclusion which is better.
Julia, which has arguably reached the highest degree of composability, follows a pattern where they have a solve method that gets passed a problem and a solver, something like:
# ODE Problem
using DifferentialEquations
f(u,p,t) = 1.01*u
u0 = 1/2
tspan = (0.0,1.0)
prob = ODEProblem(f,u0,tspan)
sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8)
# Optimization
using Optim
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
result = optimize(rosenbrock, zeros(2), BFGS())
As you can see, for any abstract mathematical problem, the pattern they adopt is similar to:
result = action(problem, args, solver)
Since the language is dynamic this works great because the result type can carry all the output state. In Fortran I have the feeling it’s better to constrain output to intrisic types and arrays, perhaps merging solver statistics into a derived type to reduce the need for long lists of dummy variables.
If I understand correctly, you’re suggesting an interface like:
type(solver_type) :: solver
type(problem_type) :: problem
solver = bfgs() ! optionally pass solver settings like stop criteria
problem = problem_type(objective,gradient)
call solve(problem,initial_guess,solver) ! <-- here lies the dichotomy
! the solve method could be actually bound to the problem as in
call problem%solve(initial_guess,solver)
! or, consumed by the solver
call solver%solve(problem,initial_guess)
The original C++ library uses the second pattern, i.e.
cppoptlib::LbfgsSolver<FortranObjective<double>> solver;
solver.minimize(f, x);
we initialize a solver and use it to minimize the objective/problem (defined beforehand).
I guess you could call these two oppoising viewpoints solver-centric versus problem-centric. Julia libraries adopt the problem-centric viewpoint. The parameters enter this whole picture as a third design point. I consciously choose to avoid encapsulating the parameters (both problem- or solver-related) so that the library could be used easily in a parallel context.
Here is an overview of the current wrapper design.
The problem type only contains the callbacks:
!> A container for the callback objects
type :: problem_type
procedure(objective_callback), pointer, nopass :: objective => null()
procedure(gradient_callback), pointer, nopass :: gradient => null()
contains
procedure :: set => set_problem
procedure :: eval => eval_objective
procedure :: eval_grad => eval_gradient
procedure, pass(problem) :: solve => solve_problem
end type
The solvers are just integer “enumerators”
type :: solver_enum
integer(c_int) :: gd = 0 !> Gradient-descent
integer(c_int) :: newton = 1 !> Newton
integer(c_int) :: bfgs = 2 !> BFGS
integer(c_int) :: lbfgs = 3 !> L-BFGS
integer(c_int) :: lbfgsb = 4 !> L-BFGS-B
integer(c_int) :: cg = 5 !> Conjugate gradient
integer(c_int) :: cmaes = 6 !> CMAES
integer(c_int) :: neldermead = 7 !> Nelder-Mead
end type
type(solver_enum), parameter :: solver = solver_enum()
integer(c_int), parameter :: default_solver = solver%bfgs
The solve method prepares the adaptors and calls the C++ routine (hiding underneath a C interface)
subroutine solve_problem(problem,result,x,solver,params,istat)
class(problem_type), intent(in), target :: problem
real(wp), intent(out) :: result
real(wp), intent(inout), contiguous :: x(:)
class(*), intent(in), optional, target :: params
integer, intent(in), optional :: solver
integer, intent(out), optional :: istat
! TODO: investigate mimicking typing for solver argument
interface
subroutine cppoptlib_solve_d(solver,obj_fcn,grad_fcn, &
n,x,istat,udata) bind(c,name="cppoptlib_solve_d")
use, intrinsic :: iso_c_binding, only: c_double, &
c_int, c_ptr, c_funptr
integer(c_int), intent(in), value :: solver
type(c_funptr), value :: obj_fcn
type(c_funptr), value :: grad_fcn
integer(c_int), intent(in), value :: n
real(c_double), intent(inout) :: x(*)
integer(c_int), intent(out) :: istat
type(c_ptr), value :: udata
end subroutine
end interface
type(callback_adaptor), target :: cb
integer(c_int) :: istat_, solver_
solver_ = default_solver
if (present(solver)) then
solver_ = solver
end if
! The callback adaptor is essentially a container that stores
! pointers to the callback functions and parameters.
! It gets passed to the C routine as an void* pointer and later
! "cast" back in the actual callback routines on the Fortran side.
cb%obj => problem%objective
cb%grad => problem%gradient
if (present(params)) then
cb%params => params
end if
! TODO: bounded problems
! if (problem%bounded) ...
call cppoptlib_solve_d( &
solver=solver_, &
obj_fcn=c_funloc(cppoptlib_obj_callback_d), &
grad_fcn=c_funloc(cppoptlib_grad_callback_d), &
n=size(x), &
x=x, &
istat=istat_, &
udata=c_loc(cb))
! Handle status (error) flag
if (present(istat)) then
istat = istat_
else
if (istat_ < 0) then
print *, "cppoptlib: something went wrong"
return
end if
end if
! Evaluate objective for user convenience
result = problem%objective(x,params)
end subroutine
The C++ side first involves a class that wraps the Fortran callback adaptors:
template<typename T>
class FortranObjective : public cppoptlib::Problem<T> {
public:
using typename cppoptlib::Problem<T>::TVector;
using typename cppoptlib::Problem<T>::THessian;
T value(const TVector &x) {
// call Fortran objective callback
return obj_(x.size(),x.data(),params_);
}
void gradient(const TVector &x, TVector &grad) {
// call Fortran gradient callback
grad_(x.size(),x.data(),grad.data(),params_);
}
// Constructor
FortranObjective(obj_callback_d obj_fcn, grad_callback_d grad_fcn, void* params) :
obj_(obj_fcn), grad_(grad_fcn), params_(params) {}
private:
// TODO: handle callback type generically?
obj_callback_d obj_{nullptr};
grad_callback_d grad_{nullptr};
void* params_{nullptr};
};
The actual routine doing the work is effectively just a big switch to the various templated solver class instances, see below. I borrowed the pattern from the MATLAB wrapper. Compared with a native C++ driver program, the downside is that this leads to bloat of the binary executable as all solvers must be included. Perhaps link-time optimization could crop some of the dead branches, depending whether it can propagate the solver flag through the mixed-language interface?
void cppoptlib_solve_d(
int selected_solver,
obj_callback_d obj_fcn,
grad_callback_d grad_fcn,
int n,
double* x_,
int* ierr,
void* udata) {
// Create view of Fortran vector buffer
Eigen::Map<Eigen::VectorXd> x_view{x_, n};
// The minimize function cannot operate on a map, so we make a copy
auto x = x_view.eval();
FortranObjective<double> f{obj_fcn,grad_fcn,udata};
// Assume success
(*ierr) = 0;
enum solver_type {
GRADIENTDESCENT,
NEWTON,
BFGS,
LBFGS,
LBFGSB,
CONJUGATEDGRADIENTDESCENT,
CMAES,
NELDERMEAD};
// solve
// ----------------------------------------------------------
switch (selected_solver) {
case LBFGS: {
cppoptlib::LbfgsSolver<FortranObjective<double>> solver;
solver.minimize(f, x);
}
break;
case BFGS: {
cppoptlib::BfgsSolver<FortranObjective<double>> solver;
solver.minimize(f, x);
}
break;
case GRADIENTDESCENT: {
cppoptlib::GradientDescentSolver<FortranObjective<double>> solver;
solver.minimize(f, x);
}
break;
case CONJUGATEDGRADIENTDESCENT: {
cppoptlib::ConjugatedGradientDescentSolver<FortranObjective<double>> solver;
solver.minimize(f, x);
}
break;
case NEWTON: {
cppoptlib::NewtonDescentSolver<FortranObjective<double>> solver;
solver.minimize(f, x);
}
break;
case NELDERMEAD: {
cppoptlib::NelderMeadSolver<FortranObjective<double>> solver;
solver.minimize(f, x);
}
break;
default:
// TODO: Illegal input, print error
*ierr = -1;
break;
}
// over-write initial guess with result
x_view = x;
}