My research group is initiating a project that aims to provide templates for a class of numerical optimization algorithms. (Disclaimer: “numerical optimization” is an overwhelmingly vast area, and we will primarily work on only a quite special class of algorithms, namely those falling in the same category as Powell’s methods without using derivatives; the implementations of such algorithms are notoriously complicated and involving.)
The motivation is to provide standard templates and building blocks for the implementation of some classical/mature algorithms, paving the way for extending the existing methods and developing new ones. We are inspired by Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, where the numerical linear algebra (NLA) community develops templates for basic NLA algorithms such as CG. By doing so, we practice the principle that coding means not only to get things done but also to document human knowledge (see the blog by @everythingfunctional titled Communicating with Whom?). Note that our objective is not at all to develop yet another optimization package. Instead, we aim to open the black boxes of optimization packages, show the standard implementation of such boxes, and provide standard modules for building new boxes.
Let me elaborate a bit on the word “template”. Here, a template for an algorithm refers to a combination of the following.
A piece of pseudocode for the algorithm.
A modulized and standardized implementation of the algorithm that faithfully corresponds to the pseudocode. The implementation should be easily readable and as similar as possible to the mathematical formulation of the algorithm. The purpose is to illustrate how to implement the algorithm for general problems. The code should be ready to use (of course) and can solve general problems of moderate size, but the performance is not the top priority. Others should be able to code the algorithm following this illustration if they need to adapt the algorithm to their particular problems.
Subroutines that can be used as building blocks for implementing similar algorithms. The subroutines must be reliable (of course) and standardized, yet need not be optimized. Others can use these subroutines for a quick prototype of algorithms that they want to study/develop. The subroutines should follow the principle of high cohesion and low coupling. Each of such subroutines should have a clearly defined objective that is sufficiently elementary and sufficiently general (e.g., solving an upper triangular linear system, finding the Lagrange interpolatant of a function on a given interpolation set, etc), so that others can easily replace them with a new version when needed.
Why are such templates relevant — and indeed important?
While there are tons of papers on numerical optimization algorithms being published, not all of them come with working open-source code. Some of them do, but the code is often dirty (sorry), unmodularized, and unstandardized. Authors of such papers/code should NOT be blamed, however, because writing research papers and possibly research-oriented code is substantially different from developing software, and we should NOT require one (group of people) to do all the jobs well.
As a consequence, researchers and users of these algorithms often find themselves doing the following things.
Spend a considerable amount of time on dirty and not-guaranteed-to-be-correct code by others.
Struggle to reproduce numerical results by others, being not sure whether the performance difference is due to the mathematics of the algorithms or due to tricks/mistakes in the implementation.
Reinvent/rebuild wheels because no standardized code or templates exist for the building blocks of the algorithms under consideration.
I myself have wasted too much of my life on these unpleasant and unproductive things, and I hope my project will help others (researchers/users) to save their time.
Now, here comes the big question and I would like to hear your opinion — what language(s) should be used for coding the templates? See points 2 and 3 in the elaboration on the word “template” given above.
It seems to me that (modern) Fortran is ideal for this job, not because I am posting in the Fortran discourse but because the following features of the language make it perfect for illustrative implementations of numerical algorithms.
The code is thankfully similar to the mathematical formulations — that’s the name of the language.
It supports matrices, vectors, and their operations natively. No need to write loops for basic matrix-vector calculations. This makes the code simple and readable.
It supports attribute specifications such as
parameter. The code can tell the readers (recall readability) clearly which variables are in/outputs, local variables, and constants.
It is strictly and statically typed. By reading the code, people can figure out what is the type, shape, and size of each variable.
Recall that the major purpose of the template is to illustrate how to implement the algorithm for general problems. The performance for particular/challenging problems is not the priority. Therefore, each of the abovementioned features is critical, yet the high efficiency of Fortran is only a plus but not included in the list.
There are other languages that can be candidates, particularly MATLAB, Python, and Julia. But MATLAB misses points 3&4, Python misses at least 4 (I am not sure about 3 because I barely know the language), and I know almost nothing about Julia.
Fortran has cons, of course.
One of the features that I dream about the most is for Fortran to support functions with multiple outputs. If Fortran had such capability, then the code would be more readable and more faithfully correspond to the math formulations. Of course, multi-output functions can be implemented as subroutines, but that is normally not how we formulate the functions mathematically — recall that we want the code to be as similar as possible to the math formulations. Another possibility is to return a structure, but it is again likely to differ from how we write the functions in mathematics — much more often than not.
People, including professionals and researchers, have a misconception that Fortran is out of date and difficult to learn/use. This is not the language’s fault, but it is a considerable disadvantage, an expulsive force for others to accept your work.
To be frank, due to the lack of interactivity, Fortran is more difficult to debug than scripting languages such as MATLAB and Python. Surely, Fortran is not alone here. It applies to all compiled languages. Fortunately, LFortran @certik is solving this problem. When will we be able to use Fortran in a way as interactive as MATLAB? For example, the
keyboardcommand in MATLAB is extremely useful when debugging. When a Fortran programmer talks about the difficulty of the language, probably (s)he does not mean it is hard to write — it is indeed easy — but means it is hard to debug. Debugging takes a major part of coding in any language. Note that debugging does not only mean spotting programming errors such as out-of-bound arrays. That part can mostly be done by compilers, and Fortran has no disadvantage regarding it. When coding numerical algorithms, most of the debugging labor is to answer questions like the following (see the list below; how to insert a sublist under a list here?). They are more about finding mathematical mistakes in the code than locating programming errors. This kind of debugging demands the interactivity of the language (indeed, IDE for the language). Traditional debugging tools may help, but not always. The
keyboard-like functionality is a killer feature here. With
keyboard, you can easily examine what your code is doing, modify it, and continue to run it without recompilation. Tools like
gdbcan do similar things but are not yet as flexible/powerful/friendly as
- Why does the code run without any error but produce a wrong result?
- Why does the algorithm work in theory but not in computation?
- Why does a new version of the code produce a significantly different result from the last version while the change seems to be insignificant?
- Related to 3, most potential users of the code are scared by the fact that Fortran needs to be compiled before use. True, it is stupid to criticize a compiled language in this way, but note that human beings are timid, lazy, and often stupid animals. An extra step before seeing the code actually run will hinder half of the users; if that extra step reminds people about difficulties and uncertainties, the other half will probably escape, leaving only the most courageous ones (like you). What is your most primitive memory about “compilation”? I bet that it includes the hundreds of errors that your compiler spat/vomited to you after you hit the “enter” key to compile your very first program in your very first coding course. That is a nightmare about coding for most normal human beings. Acknowledge or not, it is scary.
What is your opinion? The views of Fortran experts like you will be highly helpful for my project.
[Update: Indeed, what we are looking for is a lingua franca for numerical computation — a language that is simple, straightforward, and strict. Fortran was deemed in that way, but evolution is needed for it to continue to be regarded as such. See posts by @FortranFan under Fortran returns to top 20 TIOBE index, Eliminate implicit mapping, and Fortran and Neural Networks.]
One more thing: Due to the above cons, I feel obliged to accompany each template coded in Fortran with a MATLAB/Python translation. Even though the latter two languages cannot describe the algorithms as precisely/clearly/strictly as Fortran, most people may choose to read/try them rather than Fortran. But it raises one more question: What is the best way of translating a piece of modern-Fortran code to MATLAB/Python (and vice versa)? Given the high similarity between modern Fortran and MATLAB, shouldn’t there be an automated engine for doing so? Of course, the translated code should be readable — it is translated for human beings, not (only) for machines.
Thank you very much.