Speed comparison of Fortran, Python, and R

Codex continues to improve, and the cost of programming has fallen dramatically, so numerical algorithms available in Python, R, and Matlab/Octave should be available in modern Fortran and not just FORTRAN 77. I had Codex replicate in Fortran more than 100 algorithms for finding changepoints in mean, variance, regression coefficients, and other statistics, checking that the Fortran code gave the same results as the Python or R package.

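The Fortran implementations are much faster than those in R and especially those in Python. By geometric mean time per step, Fortran is roughly 4 times faster than R and 14 times faster than Python, although the three languages do not cover the same number of steps: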

summary detector time by language (s)
language  steps    total   mean  median  geomean  share  avg_rank
Python       42  347.769  8.280   1.987    2.482  0.635     2.071
R            86  163.418  1.900   0.609    0.740  0.298     2.035
Fortran     117   36.629  0.313   0.171    0.172  0.067     1.043
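
For anyone wanting to check how the columns relate, here is a minimal Python sketch. It assumes that `mean` is total time divided by the number of steps and that `share` is each language's fraction of the combined total; both assumptions reproduce the figures in the table. The per-step timings are not shown in the post, so the `sample` list at the end is hypothetical and only illustrates how a median and geometric mean would be computed. The `avg_rank` column is not reconstructed here.

```python
from statistics import geometric_mean, median

# (steps, total seconds) copied from the table above.
table = {
    "Python": (42, 347.769),
    "R": (86, 163.418),
    "Fortran": (117, 36.629),
}

def summarize(table):
    """Return mean time per step and share of combined time per language."""
    grand_total = sum(total for _, total in table.values())
    return {
        lang: {"mean": total / steps, "share": total / grand_total}
        for lang, (steps, total) in table.items()
    }

summary = summarize(table)
for lang, row in summary.items():
    print(f"{lang:8s} mean={row['mean']:.3f} share={row['share']:.3f}")

# Hypothetical per-step timings (not from the post), to show the
# median and geometric mean calculations on a small sample.
sample = [0.1, 0.2, 0.4]
print(f"median={median(sample):.3f} geomean={geometric_mean(sample):.3f}")
```

The geometric mean is less sensitive to a few very slow steps than the arithmetic mean, which is probably why the `mean` and `geomean` columns differ so much for Python.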

R has task views for domains such as time series (which includes changepoints) and finance, which I may try to replicate in part.

SciPy is removing Fortran code for non-technical reasons, as has been discussed here. For one algorithm in SciPy, fitting a Student t distribution to data, a Fortran code is about an order of magnitude faster than SciPy, including when Python calls a Fortran DLL.

All SciPy algorithms should also be available in Fortran.

Thanks for sharing these.

This has been my experience, too.

People have said things like "Back in 1958, FORTRAN was going to eliminate coding by enabling mathematicians and statisticians to write their own programs."

My perspective is that Fortran was aimed at eliminating the need for a specialized intermediary coder to translate technical problems for the computer, not at eliminating the need for humans to program. Put another way, it is good to be lazy: making something complicated simpler is usually a good thing, because it makes the task accessible to more people (and makes more people want computers, which was certainly appealing to IBM at the time).

So capabilities like those provided by Codex look set to produce huge gains in efficiency while still writing code that can be reviewed by a domain specialist who is not necessarily a programmer. And Fortran appears to be such a language: relatively intelligible to humans, yet scalable to HPC cluster-sized problems (coarrays, OpenMP, MPI, …).

So is this a step forward in realizing Fortran's original goals (or at least the one I suggested it might have had), or does this evolve into Codex generating machine code directly from prose or task descriptions (there is really only one language on a platform, after all)? Is this a convergence of technologies that happens to find Fortran in a position to blossom, or one that leads to all high-level programming languages being superseded?

So when can I just describe a problem and tell Codex to create an executable that solves it efficiently? And when that is possible, how will the result compare to Fortran?