What global minimization/maximization algorithms do you use?

Dear all,

A quick question,
What global minimization/maximization algorithms do you use?

Like, do you use simulated annealing (SA) often?
If so, a quick question about Corana’s famous SA paper.

He says on page 3,

Does the first proposed move at the new temperature have to begin from the last `x_opt`?

If so, it seems to make parallelization a little non-trivial here, because even with many CPU cores, when a new temperature starts, every chain has to start from the same last `x_opt`. So if someone tries to use many cores, with each core running a shorter Markov chain (less parameter exploration per core, so the total wall time can be decreased), it does not work well: all the cores’ Markov chains start from the same `x_opt`, so they are more or less correlated, which is bad because they do not explore the parameter space enough.
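
To make the structure I mean concrete, here is a minimal sketch of how I read it (my own illustration, not Corana’s actual algorithm, which also adjusts step lengths and cycles over the variables one at a time); the `x = xopt` line at the top of the temperature loop is the serial dependency I am worried about:

```fortran
program sa_restart_structure
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   real(dp), parameter :: pi = acos(-1.0_dp)
   real(dp) :: x(2), xopt(2), xt(2), f, fopt, ft, t, r(2), u
   integer  :: itemp, imove
   logical  :: accept

   call random_seed()
   x = [4.0_dp, -3.0_dp]
   f = cost(x);  xopt = x;  fopt = f
   t = 10.0_dp

   do itemp = 1, 200                      ! temperature loop
      x = xopt                            ! <-- the new chain starts from the last x_opt
      f = fopt
      do imove = 1, 500                   ! Markov chain at this temperature
         call random_number(r)
         xt = x + (2.0_dp*r - 1.0_dp)     ! random trial move
         ft = cost(xt)
         if (ft < f) then
            accept = .true.
         else
            call random_number(u)
            accept = u < exp((f - ft)/t)  ! Metropolis acceptance
         end if
         if (accept) then
            x = xt;  f = ft
            if (f < fopt) then
               xopt = x;  fopt = f
            end if
         end if
      end do
      t = 0.9_dp*t                        ! reduce the temperature
   end do

   print '(a,2f9.4,a,es12.4)', 'x_opt =', xopt, '   f_opt =', fopt

contains

   real(dp) function cost(p)              ! arbitrary multimodal test function (Rastrigin)
      real(dp), intent(in) :: p(2)
      cost = sum(p**2 - 10.0_dp*cos(2.0_dp*pi*p)) + 20.0_dp
   end function cost

end program sa_restart_structure
```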

PS.

I also see Goffe has f90 code,

Jacob Williams has a modern implementation,

Those are mostly based on Corana’s paper.

Another smaller illustration code can be found here,
https://courses.physics.illinois.edu/phys466/sp2013/ANNEAL/index.html
That might be mostly based on Kirkpatrick’s famous Science paper,
https://www.science.org/doi/10.1126/science.220.4598.671
since it uses the specific heat to indicate that freezing has begun.

2 Likes

I have used various algorithms for finding “an” optimum. They all have their quirks, but I would say that you can do worse than using BOBYQA by M. Powell and its derivatives, or indeed the routines in MINPACK. I have experimented with probabilistic methods (simulated annealing, SCE, PSO, differential evolution and then some), but my conclusion was that their claimed ability to achieve the global optimum is only approached if you have sufficient patience.

2 Likes

I have generally used my own Genetic Algorithm, developed during my Ph.D. (25 years ago!), coupled to a physical model (simulating light propagation in optoelectronic devices).

I have sometimes used a Simulated Annealing algorithm, but seemingly simpler than what you describe. The advantage of SA is that it can be very simple to program: a loop with a decreasing T used to compute the probability of accepting a solution. If there are only a few parameters in the problem, it can be sufficient.

It depends of course on the complexity of the “landscape” of the function you want to optimize, and on the number of parameters.
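
To illustrate, here is a minimal sketch of that kind of bare-bones SA loop (the test function, step size, and cooling rate are arbitrary choices, just to show the structure):

```fortran
program bare_bones_sa
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   real(dp) :: x, xt, xbest, f, ft, fbest, t, r, u
   integer  :: k
   logical  :: accept

   call random_seed()
   x = 5.0_dp                         ! arbitrary starting point
   f = cost(x);  xbest = x;  fbest = f
   t = 2.0_dp                         ! initial temperature

   do k = 1, 100000
      call random_number(r)
      xt = x + (2.0_dp*r - 1.0_dp)    ! random trial move
      ft = cost(xt)
      if (ft < f) then                ! always accept an improvement...
         accept = .true.
      else                            ! ...otherwise accept with probability exp(-df/T)
         call random_number(u)
         accept = u < exp((f - ft)/t)
      end if
      if (accept) then
         x = xt;  f = ft
         if (f < fbest) then
            xbest = x;  fbest = f
         end if
      end if
      t = 0.9999_dp*t                 ! decreasing temperature
   end do

   print '(a,f9.4,a,f9.4)', 'best x =', xbest, '   cost =', fbest

contains

   real(dp) function cost(x)          ! a simple one-parameter multimodal function
      real(dp), intent(in) :: x
      cost = 0.1_dp*x*x - cos(2.0_dp*x)
   end function cost

end program bare_bones_sa
```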

1 Like

Since with a stochastic algorithm you cannot be certain of obtaining the global optimum, parallelization could consist of launching several Simulated Annealing runs, each one following its own trajectory in the “landscape”, and taking the best solution at the end (the best local optimum).
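
A minimal sketch of that strategy, assuming OpenMP (the tiny annealing chain inside `one_run`, its little random generator, and the test function are only placeholders for whatever serial SA code you already have):

```fortran
program parallel_sa
   use, intrinsic :: iso_fortran_env, only: int64
   implicit none
   integer, parameter :: dp = kind(1.0d0), nruns = 8
   real(dp) :: xbest(nruns), fbest(nruns)
   integer  :: i, ibest

   !$omp parallel do
   do i = 1, nruns
      call one_run(i, xbest(i), fbest(i))     ! each run follows its own trajectory
   end do
   !$omp end parallel do

   ibest = minloc(fbest, dim=1)               ! keep the best of the local optima
   print '(a,f9.4,a,es12.4)', 'best x =', xbest(ibest), '   f =', fbest(ibest)

contains

   subroutine one_run(run, xout, fout)
      ! One independent annealing trajectory with its own random stream,
      ! so that the threads do not share random-number state.
      integer,  intent(in)  :: run
      real(dp), intent(out) :: xout, fout
      integer(int64) :: s
      real(dp) :: x, f, xt, ft, t
      integer  :: k
      logical  :: accept
      s = 1234_int64 + 999_int64*run           ! per-run seed
      x = 20.0_dp*rand01(s) - 10.0_dp          ! random start in [-10, 10]
      f = cost(x);  xout = x;  fout = f
      t = 5.0_dp
      do k = 1, 50000
         xt = x + (2.0_dp*rand01(s) - 1.0_dp)  ! random trial move
         ft = cost(xt)
         if (ft < f) then
            accept = .true.
         else
            accept = rand01(s) < exp((f - ft)/t)   ! Metropolis test
         end if
         if (accept) then
            x = xt;  f = ft
            if (f < fout) then
               xout = x;  fout = f
            end if
         end if
         t = 0.9998_dp*t                       ! cooling
      end do
   end subroutine one_run

   real(dp) function cost(x)                   ! a simple multimodal test function
      real(dp), intent(in) :: x
      cost = x*x + 10.0_dp*sin(3.0_dp*x)
   end function cost

   real(dp) function rand01(s)                 ! tiny LCG, uniform in [0, 1)
      integer(int64), intent(inout) :: s
      s = mod(1103515245_int64*s + 12345_int64, 2147483648_int64)
      rand01 = real(s, dp)/2147483648.0_dp
   end function rand01

end program parallel_sa
```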

3 Likes

The documentation of the NAG library is informative even if you don’t have access to the software, since algorithms are described and references are given. You could look at section
Global Optimization of a Function.

• e05kbf (Mark 28.3), nagf_glopt_handle_solve_mcs: Bound-constrained global optimization by multi-level coordinate search, using function values only
• e05saf (Mark 23), nagf_glopt_bnd_pso: Global optimization using particle swarm algorithm (PSO), bound constraints only
• e05sbf (Mark 23), nagf_glopt_nlp_pso: Global optimization using particle swarm algorithm (PSO), comprehensive
• e05ucf (Mark 24), nagf_glopt_nlp_multistart_sqp: Global optimization using multi-start, nonlinear constraints
• e05usf (Mark 24), nagf_glopt_nlp_multistart_sqp_lsq: Global optimization of a sum of squares problem using multi-start, nonlinear constraints

2 Likes

Since the word “convex” does not appear in the referenced ACM TOMS article, it is worth pointing out that for a convex problem, all local solutions are global solutions.

1 Like

Thank you all @Arjen @vmagnin @Beliavsky @themos
I appreciate all of your insights! I learned that besides SA there are also other algorithms, such as particle swarm, genetic algorithms, and BOBYQA.
There is a book

However, the examples are all written in Julia. :sweat_smile:

Concerning genetic algorithms, you will find some Fortran programs on GitHub. For example:

1 Like

Interval methods are also worth looking into. In low dimensional spaces, they are often one of the fastest global optimizers.

2 Likes

The article on simulated annealing praises the Nelder-Mead algorithm for being quite robust and fast. That has prompted me to have a look at it again. On Netlib there was one implementation, and I intend to modernise it a bit :wink:.

2 Likes

I can offer my experiences: whenever possible, start with grid searches and plot contours of the objective function. Look out for the pathological “crinkly bottoms” commonly seen when fitting to noisy experimental data, and if the objective function includes elements from a Monte Carlo simulation, planes consisting of two variances can contain very deep, narrow, banana-shaped valleys.
I always run a random search routine in the background and leave it running for much longer than looks sensible. I posted a graph in our PhD office of one case where thousands of iterations had produced no obvious improvement, then suddenly a dramatically better “new” optimum was found. This graph always raises the same questions: why did you leave it running so long, and did you already know the “new” solution existed? The answer is always the same: no, I did not know, but I was having a professorial moment and had forgotten it was still running.
The next step really depends on how many times you are going to run these programs and how expensive the objective function is, but the basics always apply. Do not believe an optimisation until you have found it multiple times starting from very different parts of the parameter space, and always grid search the immediate vicinity to check that a much better solution is not lurking nearby. Be patient, be constructively cynical, and get lucky.
BTW I agree with @Arjen: a combination of simulated annealing and folding polygon has served me, in my domain, very well over the years.
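
And for anyone who wants a concrete starting point, here is a minimal sketch of the “grid search the immediate vicinity” step above (the objective function, box size, and grid resolution are arbitrary placeholders); it also dumps the grid to a file for contour plotting:

```fortran
program local_grid_search
   implicit none
   integer, parameter :: dp = kind(1.0d0), n = 101
   real(dp) :: x0(2), half, p(2), best(2), fbest, f
   integer  :: i, j, unit

   x0   = [3.0_dp, 2.0_dp]              ! candidate optimum to be checked
   half = 0.5_dp                        ! half-width of the box around it
   best = x0;  fbest = cost(x0)

   open (newunit=unit, file='grid.dat', status='replace', action='write')
   do i = 1, n
      do j = 1, n
         p(1) = x0(1) - half + 2.0_dp*half*real(i - 1, dp)/real(n - 1, dp)
         p(2) = x0(2) - half + 2.0_dp*half*real(j - 1, dp)/real(n - 1, dp)
         f = cost(p)
         write (unit, *) p, f           ! dump the grid for contour plotting
         if (f < fbest) then            ! anything better lurking nearby?
            best = p;  fbest = f
         end if
      end do
      write (unit, *)                   ! blank line between grid rows (gnuplot style)
   end do
   close (unit)

   print '(a,2f9.4,a,es12.4)', 'best grid point:', best, '   cost:', fbest

contains

   real(dp) function cost(p)            ! placeholder objective (Himmelblau's function)
      real(dp), intent(in) :: p(2)
      cost = (p(1)**2 + p(2) - 11.0_dp)**2 + (p(1) + p(2)**2 - 7.0_dp)**2
   end function cost

end program local_grid_search
```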

3 Likes

Alan Miller’s site has several optimization codes. For Nelder-Mead there is minim.f90 with driver t_minim.f90, which compiles and runs with gfortran, ifort, and g95. For global optimization I slightly modified and uploaded Global_Optimization_Miller

  • global.f90: at Arnold Neumaier’s web site, this is recommended as the most successful of the global optimization packages. There is a sample program fit.f90 and the original documentation global.txt for the f77 version. I have included testfunc.f90, which will eventually contain all of Neumaier’s 30 test functions. N.B. Users of local optimization packages usually obtain satisfactory convergence after 10s or sometimes 100s of function evaluations; global optimization routines usually require many 1000s of function evaluations.
2 Likes

John Burkardt has an implementation of the original Nelder-Mead code (Algorithm AS 47) from the Journal of Applied Statistics:

asa047

I wrapped his routines (and made some mods of my own) in a class. Note his version does not include a quadratic fit to improve convergence.

Also, for anyone looking for good references on optimization, particularly the newer “nature-inspired” methods, I can recommend the following books.

Yang is the originator of several popular nature-inspired methods (Firefly, Cuckoo Search, etc.).

3 Likes

Spoiled for choice :smiley:

1 Like

I had a closer look at Alan Miller’s implementation: it has both the Nelder-Mead method and the quadratic surface approximation. Nice. The interface is a bit complicated because you need to specify a lot of parameters, but that can be remedied by a small wrapper.

2 Likes

I’ve come across this article, which seems to suggest that “DE” and “CS” perform relatively well for their test calculations (but I guess the conclusion may also depend pretty strongly on the problem being studied).

1 Like

Thank you all!

Perhaps it would be a good idea if we could have a Fortran package which contains various global and local optimization algorithms. Probably I will start to do this, LOL.

For me the only advantage of Julia is that it has DifferentialEquations.jl, which contains some of the latest ODE/SDE/DDE solvers. Fortran also has most of them; it is just that we need more of the new algorithms implemented in modern Fortran.

1 Like

I have just rewritten my Nelder-Mead in Fortran 2018 :upside_down_face:

2 Likes

Here is a parallel version of PIKAIA,
https://whitedwarf.org/parallel/#sec5
Alan Miller also has an F90 version,
https://wp.csiro.au/alanmiller/

For genetic algorithms, it seems there is also PGAPACK,

2 Likes

If the function (either analytic or numerical) “behaves” (meaning it is differentiable in a given interval), I normally use an algorithm I found in this paper (and others by the same authors, but this is the first and probably the most detailed one; a freely downloadable version can be found here). They describe the theory behind their algorithm in detail, but if you are in a hurry they also give a pseudo-code (Pascal-like) which is easy to follow and easy to implement in Fortran; I did that years ago and have used it ever since. The algorithm is able to find all the roots (or extrema) of a function within a user-specified interval, not just some of the roots/extrema: all of them.

I think the idea behind that algorithm is very clever: if you can, somehow, find the total number of roots/extrema within a given interval, then you can easily find those roots/extrema as well; all you need to know is how many of them are located in the interval. Using Kronecker-Picard theory, the computation of the total number of roots or extrema is reduced to a definite integral (I use QUADPACK for the integration itself, but you can use whatever else; it is not a “badly behaving” integral). Once you have that number, computing the roots/extrema is trivial and fast: divide and conquer.
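
To make that concrete, here is a minimal sketch of the counting step as I remember the Kronecker-Picard formula (please check it against the paper before relying on it): the number of simple roots of f in (a, b) is (1/pi) * [atan(gamma*f'(b)/f(b)) - atan(gamma*f'(a)/f(a))] minus (1/pi) times the integral of gamma*(f*f'' - f'^2)/(f^2 + gamma^2*f'^2) over (a, b). The test function, the value of gamma, and the use of a simple Simpson rule instead of QUADPACK are my own arbitrary choices:

```fortran
program kp_root_count
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   real(dp), parameter :: pi = acos(-1.0_dp), gam = 1.0_dp
   real(dp) :: a, b, integral, boundary
   integer  :: nroots

   a = 0.5_dp
   b = 9.0_dp               ! sin(x) has two simple roots in (0.5, 9): pi and 2*pi

   ! integral of d/dx atan(gam*f'(x)/f(x)); the integrand stays finite at the roots
   integral = simpson(integrand, a, b, 2000)
   boundary = atan(gam*fp(b)/f(b)) - atan(gam*fp(a)/f(a))
   nroots   = nint((boundary - integral)/pi)

   print '(a,i0)', 'estimated number of simple roots: ', nroots

contains

   real(dp) function f(x)               ! the test function
      real(dp), intent(in) :: x
      f = sin(x)
   end function f

   real(dp) function fp(x)              ! its first derivative
      real(dp), intent(in) :: x
      fp = cos(x)
   end function fp

   real(dp) function fpp(x)             ! its second derivative
      real(dp), intent(in) :: x
      fpp = -sin(x)
   end function fpp

   real(dp) function integrand(x)       ! d/dx atan(gam*f'/f)
      real(dp), intent(in) :: x
      integrand = gam*(fpp(x)*f(x) - fp(x)**2)/(f(x)**2 + gam**2*fp(x)**2)
   end function integrand

   real(dp) function simpson(g, lo, hi, n)   ! composite Simpson rule, n even
      interface
         function g(x) result(y)
            real(kind(1.0d0)), intent(in) :: x
            real(kind(1.0d0)) :: y
         end function g
      end interface
      real(dp), intent(in) :: lo, hi
      integer,  intent(in) :: n
      real(dp) :: h
      integer  :: i
      h = (hi - lo)/n
      simpson = g(lo) + g(hi)
      do i = 1, n - 1
         simpson = simpson + merge(4.0_dp, 2.0_dp, mod(i, 2) == 1)*g(lo + i*h)
      end do
      simpson = simpson*h/3.0_dp
   end function simpson

end program kp_root_count
```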

The neat thing about this approach is that it shouts “I am highly parallelizable” from afar (see, e.g., this paper). Furthermore, the method works very well with numerical functions (for example, a cubic spline based on computed data). I have used such an algorithm to compute whatever I needed for a specific task: from simple roots, maxima or minima to eigenvalues/eigenfrequencies.

The drawback is what you probably noticed right from the beginning: the function must be continuously differentiable (at least twice) in the interval where you are looking for all roots or extrema.

3 Likes