do not tell me about that… every time I am amazed there are still people in the audience who do not understand that… and usually I teach PhD students…
I don’t find that surprising at all. How would beginners to Fortran (and most likely, programming in general) know that before you teach it to them? For a language that works mostly like math, integer division is an exception.
If you’ve only ever had a math class, why would you have any reason to doubt that 1/2 equals 0.5? It’s one of those intuitions/assumptions that people just make without realizing that computers can’t. Most people’s intuition would tell them that the type of the result of the expression 4/2 is an integer (2), but the result of the expression 1/2 is a real (0.5), without realizing that’s an intuition the computer doesn’t have.
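A minimal, self-contained illustration of the rule being discussed:

```fortran
program int_division
  implicit none
  print *, 1/2       ! both operands are integers: integer division, prints 0
  print *, 4/2       ! integer division, prints 2
  print *, 1.0/2.0   ! both operands are reals: real division, prints 0.5
  print *, 1.0/2     ! mixed: the integer is converted to real, prints 0.5
end program int_division
```

The rule is purely about the types of the two operands, not about whether the mathematical result happens to be a whole number.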
Fun fact: the number of computer programmers in the world doubles roughly every 5 years. That means that roughly half of all programmers have less than 5 years of experience. And that is likely to remain the case for another doubling or two (5 to 10 more years) at least.
The Quest for Speed - An Indicative Survey in Improving Performance implements the elastic-net regularized regression method (a form of variable selection used when there are many predictors) in NumPy, C++, and Fortran. GitHub code
Overall, roughly a 2-3x speedup can be seen for the largest of cases with the use of Fortran or C++, which is not insignificant. A similar runtime pattern is seen across all languages, showing a linear improvement in performance with their use. A closer look at the speedup sub-plots via the interactive link is recommended, as they show that even with the highest number of observations and dimensions, Fortran and C++ still show speedups of 2.8x and 1.95x respectively.
…
Summary
Rather significant improvements have been shown through the use of Fortran and C++, of between 2x-3x for even the largest of cases, with the worst being around 1.5x. The standard operations available within Fortran that help in scientific computing, its higher performance in this study, and its limited additional requirements show that Fortran should be considered for accelerating EQ, especially for the more loop-heavy parts of the code, where statically-typed languages such as Fortran and C++ can flex their performance advantages.
Below is a 2019 preprint with a similar theme. Can one write a Fortran 2018 linear algebra library wrapping Lapack that is as high-level and convenient as Matlab etc., without sacrificing speed? Lapack95 has short calling sequences but cryptic subroutine names, for example
- CALL LA_GESV ( A, B, IPIV=ipiv, INFO=info )
Solves a general system of linear equations AX = B.
and one could define an operator and write
X = B .div. A
The Matran project did this but has not been updated recently. The Matran page says pointer arrays will be replaced with allocatables “when the allocatable array extensions to Fortran 95 are widely and correctly implemented.” That happened years ago.
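A sketch of how such a .div. operator could be defined, along the lines of Matran. The module and function names here are invented for illustration, and a production version would also handle non-square systems and matrix right-hand sides:

```fortran
module linalg_ops
  implicit none
  private
  public :: operator(.div.)

  interface operator(.div.)
    module procedure mat_div
  end interface
contains
  ! X = B .div. A returns the solution of A X = B (cf. Matlab's A \ B),
  ! delegating the actual work to LAPACK's dgesv.
  function mat_div(b, a) result(x)
    real(8), intent(in) :: b(:), a(:,:)
    real(8) :: x(size(b)), a_copy(size(a,1), size(a,2))
    integer :: ipiv(size(b)), info
    a_copy = a    ! dgesv overwrites its matrix argument...
    x = b         ! ...and returns the solution in place of B
    call dgesv(size(b), 1, a_copy, size(b), ipiv, x, size(b), info)
    if (info /= 0) error stop "dgesv failed"
  end function mat_div
end module linalg_ops
```

With that in scope, `X = B .div. A` reads almost like the math while keeping LAPACK underneath.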
The Linear Algebra Mapping Problem
Christos Psarras, Henrik Barthels, Paolo Bientinesi
We observe a disconnect between the developers and the end users of linear algebra libraries. On the one hand, the numerical linear algebra and the high-performance communities invest significant effort in the development and optimization of highly sophisticated numerical kernels and libraries, aiming at the maximum exploitation of both the properties of the input matrices, and the architectural features of the target computing platform. On the other hand, end users are progressively less likely to go through the error-prone and time consuming process of directly using said libraries by writing their code in C or Fortran; instead, languages and libraries such as Matlab, Julia, Eigen and Armadillo, which offer a higher level of abstraction, are becoming more and more popular. Users are given the opportunity to code matrix computations with a syntax that closely resembles the mathematical description; it is then a compiler or an interpreter that internally maps the input program to lower level kernels, as provided by libraries such as BLAS and LAPACK. Unfortunately, our experience suggests that in terms of performance, this translation is typically vastly suboptimal.
In this paper, we first introduce the Linear Algebra Mapping Problem, and then investigate how effectively a benchmark of test problems is solved by popular high-level programming languages. Specifically, we consider Matlab, Octave, Julia, R, Armadillo (C++), Eigen (C++), and NumPy (Python); the benchmark is meant to test both standard compiler optimizations such as common subexpression elimination and loop-invariant code motion, as well as linear algebra specific optimizations such as optimal parenthesization of a matrix product and kernel selection for matrices with properties. The aim of this study is to give concrete guidelines for the development of languages and libraries that support linear algebra computations.
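The abstract’s point about optimal parenthesization is easy to make concrete in Fortran itself: for n×n matrices A, B and a vector v, evaluating A·B·v left to right costs O(n³), while re-associating it costs only O(n²), even though both yield the same result:

```fortran
program parenthesization
  implicit none
  integer, parameter :: n = 1000
  real(8) :: a(n,n), b(n,n), v(n), x(n)
  call random_number(a); call random_number(b); call random_number(v)
  ! Left-to-right: an O(n^3) matrix-matrix product happens first.
  x = matmul(matmul(a, b), v)
  ! Re-parenthesized: two O(n^2) matrix-vector products.
  x = matmul(a, matmul(b, v))
end program parenthesization
```

A naive translation of `A*B*v` in a high-level language typically takes the first, slower path; recognizing the second is exactly the kind of linear-algebra-specific optimization the paper benchmarks.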
That’s not in the spirit of the original FORmula TRANslating system… Hence indeed a weakness.
instead, languages and libraries such as Matlab, Julia, Eigen and Armadillo, which offer a higher level of abstraction, are becoming more and more popular. Users are given the opportunity to code matrix computations with a syntax that closely resembles the mathematical description;
They are in the spirit of the original Backus’ Fortran…
I see that the linalg module of fortran-utils by @certik does this. It starts as follows:
module linalg
use types, only: dp
use lapack, only: dsyevd, dsygvd, ilaenv, zgetri, zgetrf, zheevd, &
dgeev, zgeev, zhegvd, dgesv, zgesv, dgetrf, dgetri, dgelsy, zgelsy, &
dgesvd, zgesvd, dgeqrf, dorgqr, dpotrf, dtrtrs
use utils, only: stop_error, assert
use constants, only: i_
implicit none
private
public eig, eigvals, eigh, inv, solve, eye, det, lstsq, diag, trace, &
svdvals, svd, qr_fact, cholesky, solve_triangular
! eigenvalue/-vector problem for general matrices:
interface eig
module procedure deig
module procedure zeig
end interface eig
! eigenvalue/-vector problem for real symmetric/complex hermitian matrices:
interface eigh
module procedure deigh_generalized
module procedure deigh_generalized_values
module procedure deigh_simple
module procedure zeigh_generalized
module procedure zeigh_simple
end interface eigh
! eigenvalues for general matrices:
interface eigvals
module procedure deigvals
module procedure zeigvals
end interface eigvals
! matrix inversion for real/complex matrices:
interface inv
module procedure dinv
module procedure zinv
end interface inv
! solution to linear systems of equation with real/complex coefficients:
interface solve
module procedure dsolve
module procedure zsolve
end interface solve
interface solve_triangular
module procedure dsolve_triangular
end interface solve_triangular
! determinants of real/complex square matrices:
interface det
module procedure ddet
module procedure zdet
end interface det
! least squares solutions to real/complex systems of equations of possibly non-square shape:
interface lstsq
module procedure dlstsq
module procedure zlstsq
end interface lstsq
! construction of square matrices from the diagonal elements:
interface diag
module procedure ddiag
module procedure zdiag
end interface diag
! trace of real/complex matrices:
interface trace
module procedure dtrace
module procedure ztrace
end interface trace
! singular values of real/complex matrices:
interface svdvals
module procedure dsvdvals
module procedure zsvdvals
end interface svdvals
! singular value decomposition of real/complex matrices:
interface svd
module procedure dsvd
module procedure zsvd
end interface svd
! Cholesky decomposition
interface cholesky
module procedure dcholesky
end interface cholesky
! assert shape of matrices:
interface assert_shape
module procedure dassert_shape
module procedure zassert_shape
end interface assert_shape
contains
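For reference, a short sketch of what calling such a module might look like, assuming fortran-utils is built and its module files are on the compiler’s search path (the solve and trace interfaces are as suggested by the listing above):

```fortran
program linalg_demo
  use types, only: dp
  use linalg, only: solve, trace
  implicit none
  real(dp) :: a(2,2), b(2), x(2)
  ! A diagonal system: A = diag(2, 4), b = [1, 2]
  a = reshape([2.0_dp, 0.0_dp, 0.0_dp, 4.0_dp], [2, 2])
  b = [1.0_dp, 2.0_dp]
  x = solve(a, b)     ! expect x = [0.5, 0.5]
  print *, x
  print *, trace(a)   ! expect 6.0
end program linalg_demo
```

The cryptic d/z prefixes and LAPACK bookkeeping (pivots, info flags, workspace) stay hidden behind the generic interfaces.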
Bingo,
The Foreign Function Interfaces that many/all scripting languages offer mean that one can handle data flow management/IO/visualization/database interfacing with something easier to work with, and let Fortran do what it does best.
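As a sketch of that pattern (the file and symbol names here are invented for illustration): a hot loop exported through the C ABI, which is the common ground that scripting-language FFIs such as Python’s ctypes or cffi speak:

```fortran
! kernel.f90 -- a hypothetical hot loop exposed through the C ABI.
! Build as a shared library, e.g.:
!   gfortran -shared -fPIC -o libkernel.so kernel.f90
subroutine scale_add(n, a, x, y) bind(c, name="scale_add")
  use iso_c_binding, only: c_int, c_double
  integer(c_int), value :: n
  real(c_double), value :: a
  real(c_double), intent(in)    :: x(n)
  real(c_double), intent(inout) :: y(n)
  y = y + a*x   ! the heavy numerical work stays in compiled Fortran
end subroutine scale_add
```

From Python, `ctypes.CDLL("./libkernel.so")` would then expose `scale_add` directly, keeping the data wrangling in the scripting language and the number crunching in Fortran.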
I don’t think it is the agility of Python as a programming language itself, but the developed infrastructure of libraries, online documentation and learning resources, which enabled Bob to quickly find the Python kdtree in the first place.
Correct, but then again, why does such an infrastructure of libraries even exist? You can find Python libraries for literally everything, and in most cases it just doesn’t make sense. The exact same thing could be done much faster with a language that at least has a compiler. Don’t ask me why this is happening; it is beyond me. There is no logical explanation. I can only quote the well-known aphorism attributed to Einstein.
Flipping the coin of the debate…
There is no debate here, or at least there shouldn’t be any. Python is very good for scripting, and that’s it. Otherwise it’s one of the worst choices for scientific computing out there, period. End of story.
I dislike C, and even more C++, which are obviously not designed for scientific computing. But if I only had two choices, either C/C++ or Python for that kind of computing, I wouldn’t think for a fraction of a second: I would pick C/C++ immediately - or literally anything else but Python. For obvious reasons. How those reasons seem not to be obvious to many others is, again, beyond me.
I agree. I think also that Julia or even Matlab are much better than Python for scientific computing. A good compromise is to use Julia or Matlab as user-friendly interface and do the heavy numerical calculations in Fortran. That said, Julia is reasonably fast in itself, Python is much slower.
The exact same thing could be done much faster with a language that at least has a compiler.
I think there’s a bit of a conflation here. The resulting code would execute “much faster” with a compiled language, but actually developing the code in the first place would not be “much faster”.
When I was first learning software development I loved Python because the feedback loop could be so much quicker. I could play around with a concept in the REPL, to figure out how I could/should write a little piece of code, write it, then load the module it was in and again just play around with it. No need to go through a write-compile-execute loop, no need for a testing framework, just “play” with the code. For data science type stuff being able to “play” with the data is also hugely beneficial.
I can absolutely see why Python became popular, but I see no reason other languages, even Fortran, couldn’t obtain the same level of interactivity with the right tooling. Haskell is a compiled language that also has an interactive REPL, and it was also hugely helpful when I was learning that language. LFortran has a REPL so it’s certainly possible. I think we could gain a lot of Python converts if we got things working well in this space.
Julia or Matlab as user-friendly interface and do the heavy numerical calculations in Fortran
The whole point of Julia is that Julia will function as a user-friendly interface while doing the heavy numeric calculations just as fast as Fortran (or faster if the expressive power gives you time to meta-program a super fancy algorithm).
What people see in practice matters, not mere goals, wishes, and exaggerations.
https://www.reddit.com/r/Julia/comments/18plt1h/machine_learning_frameworks_feel_sluggish_why_is/
The resulting code would execute “much faster” with a compiled language, but actually developing the code in the first place would not be “much faster”.
That’s correct; however, I still think the first “much faster” outbalances the second. Admittedly, I never used Python for a really large project (because, as far as I know, I am still sane), but I did play with it enough to realize things. Additionally, I played a lot with Lua (I actually wrote a numerical-methods package for it years ago) and I certainly played much more with Scilab (which is more “high level” than Python).
Now, my experience says that, for serious projects at least, it always goes like this: You start with one of those (Scilab or Python or whatever similar) and you do indeed have the tools to proceed faster… in the beginning - in the very beginning. If you want to try a new idea, you can have a working (and small) code that can be used to play around with your new idea. You have a plethora of ready-made libraries to use, and you don’t even need to save the code if you are using the REPL. The beginning is full of roses, basically the definition of convenience. So you have a code to use in, let’s say, 1-2 hours, sometimes even in just a few minutes. Then you start to play around and see if your idea works. Yes, it might be slower, but who cares; all you want is some results to confirm the idea works. Usually your idea doesn’t work right away, and the code needs some modifications. And even if it does work right away, it definitely needs polishing - lots of it… Either way, this is when the reality of doing business in such a language strikes.
The code polishing becomes more and more demanding, needing more and more trial and error. And each try is much slower than what it could be. Granted, it might still be “fast enough” generally speaking, but even a one-second running time (which is typical for small projects) becomes tedious when you have to do it again and again. You realize you could do the trial-and-error phase literally orders of magnitude faster, if you used a compiled language. This is when you start thinking “ok, the idea works but the code itself still needs a lot of work, and I definitely won’t do that in this environment, let’s switch to Fortran”. At least that’s what I did several times.
Of course, if we are talking about small projects - or projects where the computational part is not very demanding anyway - you can leave it in Python or whatever other language you are using. I myself have lots of code in Maxima that could run much faster if it were written in Fortran, but it’s good enough as it is and I don’t need to use it often anyway. So I never felt the need to rewrite the whole thing in Fortran. However, that’s the exception rather than the rule.
For large projects, you will typically need a compiled language sooner or later - and pray it is sooner rather than later, because the more you delay it, the harder it becomes. So my conclusion is: “you know what, I will need Fortran anyway, so I’ll start with it right away”. After all, using Fortran from the very beginning is not that much slower. Its array features are unbeatable, and I have a pack of libraries backing me up (some written by me, some taken from one of the countless “FOOPACK”s). The foundation of the new code is already there; it’s not like I am starting from scratch. This is where LFortran’s REPL can help as well, and this is one of the many reasons I like LFortran.
If what you are going to do seems like a serious project, the following inequality for total time needed holds:
time(Fortran right away) << time(whatever interpreted but convenient)
The former has a somewhat slower start, but it pays off soon. The latter usually starts great, but you hit a “stall wall” pretty quickly. Of course, many people will disagree. I know people who did their whole Master’s in… Mathematica or Matlab, for Devil’s sake - and it included a hell of a lot of numerical computations. I tried to ask… “why”, and I quickly realized there is no way to talk them out of it. The typical answer, disarming in its naivety, was something like “today’s computers are fast enough, no need to bother with Fortran, C/C++, whatever”. Essentially they just avoid learning a real programming language, and rely on a convenient but interpreted and slow environment instead. There is no way to contradict such “logic”, so nowadays I just nod in comprehension and walk away.
I don’t agree with the “Julia as fast as Fortran” statement. There is a paper that compares the performance of different programming languages and finds that Julia is about 50% slower than Fortran or C.
The paper is written by two economists who solve a basic real business cycle model (a workhorse model in macroeconomics) using Fortran, C, Matlab, Julia and other programming languages. It’s quite interesting and they find that Fortran and C are the fastest (no significant difference between the two) followed by Julia and then Matlab. Python is much slower. I attach the table with the run time comparison from the paper.
Note: the paper is from 2018, so its results might be a bit outdated
Link to the codes to replicate the paper: GitHub - jesusfv/Comparison-Programming-Languages-Economics: A Comparison of Programming Languages in Economics
See the discussion on the Julia discourse from 2016 about this benchmark: A Comparison of Programming Languages in Economics - #58 by Elrod - Community - Julia Programming Language. The TLDR is that there were a number of pretty simple changes that give minor speedups and bring Julia on par with (or slightly ahead of?) C++.
Any individual benchmark of similar algorithms will show a small benefit to one of Julia, Fortran, or C++ (in the same way that any individual benchmark will show differences between GCC and Clang). These differences are very frequently ~50% and depend heavily on minute differences in the code, random chance (different compilations can randomly shuffle things around, leading to cache differences), or how the compiler is feeling on that specific day. What’s more important than which language wins a specific benchmark is that you can get top-notch performance if you try, and get “good” (e.g. within 2x of optimal) performance without heroics.
I agree with all that, and it’s part of the reason I ended up sticking with Fortran, but
my experience says that …
it takes experience to understand this. Most programmers have less than 5 years experience. So most programmers don’t realise they’re going to hit that wall, so they go with what’s easier to start with.
The typical answer, disarming in its naivety, was something like “today’s computers are fast enough, no need to bother with Fortran, C/C++, whatever”. Essentially they just avoid learning a real programming language, and rely on a convenient but interpreted and slow environment instead.
In the late 80s, computers were so slow that when you discovered Turbo Pascal, that excellent Borland compiler, you would not want to go back to BASIC (although some BASICs of the same period were good and structured, like GFA Basic, and there were even compiled BASICs like Turbo Basic). The difference between interpreted and compiled languages was evident to everybody.
With today’s computers, you don’t see the difference for small computations. And interpreted languages, especially Python, are often the first language learned. I learned Python around 2010 and really enjoyed it for “scripts” (which is just a name for “small” programs) because it is concise, has structures like lists, and comes with an easy-to-use standard library.
But if you want to go to the other side of the Earth, it’s faster to take a plane than a train… A tool for each application.
In the late 80s, computers were so slow that when you discovered Turbo Pascal, that excellent Borland compiler, you would not want to go back to BASIC
Everybody liked Turbo Pascal, myself included. But B…B… Basic? I learned Assembly for the MOS 6510 just because of Basic. Even the good versions of it (BBC Basic, that is) were abysmally slow.
With today’s computers, you don’t see the difference for small computations. And interpreted languages, especially Python, are often the first language learned.
I would argue here that I don’t see any reason for Python to be the first language to learn. Why not Pascal, for example, as it used to be? Everybody forgot that one. It’s not harder to learn; it doesn’t lack features; it has excellent modern versions and compilers. Why not Lua, as another example, if you insist on interpreted languages? And I could mention many others here.
Why did Python specifically manage to become so popular that you will find libraries written in it for everything, even for tasks Python is blatantly not suitable for? But that’s not the real problem anymore.
The real problem nowadays is that many newcomers learn Python and stay with Python. It’s not even “learn this one as an introduction, then switch to C/C++/whatever at some point in the future” anymore. It’s just Python… Python for everything, forever. I know people who do that and are ready to defend it with a passion. I’m not even sure they really know (or believe, or care) that they could run their programs 100-1000 times faster if they used a decent programming language that has a compiler. The real problem is Python hits #1 (in TIOBE index, for example). An interpreted language is steadily #1. The Basic of modern era, I guess… Well, my reaction is “no, thank you”, even though some will look at me as if they were staring at an exotic weird species instead of a normal person.
Everybody liked Turbo Pascal, myself included. But B…B… Basic? I learned Assembly for the MOS 6510 just because of Basic. Even the good versions of it (BBC Basic, that is) were abysmally slow.
I learned programming with BASIC on a Sharp PC-1211. No smartphone for kids in those days… The PC-1211 had ~1.5 kB of RAM and a 4-bit microprocessor at 256 kHz. I respect BASIC as my first step into programming: it was Beginner’s All-purpose Symbolic Instruction Code, and I was an absolute beginner (not yet a teen) and it did the job, introducing me to the computer world. In 1981, Kraftwerk was singing in its Computer World album:
I program my home computer
Beam myself into the future
Kraftwerk – Home Computer Lyrics | Genius Lyrics
https://www.youtube.com/watch?v=Ad-8gOMM9yk
BASIC did beam me into the future. Well, it also quickly attracted me toward assembly language… (it was easy to learn on those old small microprocessors, and just still more basic than BASIC).
The real problem is Python hits #1 (in TIOBE index, for example). An interpreted language is steadily #1. The Basic of modern era, I guess…
Probably it was taught as a first language because it was interactive and had a simple and powerful syntax (a Python program can typically be 2x or 3x smaller than a C program). And there are even now pocket calculators with Python inside. The problem is that now that it is #1, it will be hard to dislodge before the next decade…
When I was a student, Turbo Pascal was taught in France as a first language. Then it mysteriously disappeared, despite Delphi (a great visual Pascal), which was quite successful, but probably less so than MS Visual Basic.
Probably Python became #1 due to a combination of causes, some of them accidental (just as mammals replaced dinosaurs…).