The counter-intuitive rise of Python in scientific computing

That was a little better than my Casio PB-410. I only had 1568 bytes to work with, so every single byte was precious. I’m sure you know what I mean. In my case the Casio had a counter showing the free RAM available, and each time you entered a new Basic statement the counter went down. This is why the “EXE” (Enter) key was the “horror key”. :laughing:

Don’t get me wrong, Basic was not bad as an introductory language for those pocket machines. As far as I know, there was only one “pocket calculator” that had a C compiler instead of Basic - but it was expensive, so for common mortals Basic was the only choice. I certainly learned a lot using Basic on the little Casio. I dare say the limited memory also helped in a way - I learned quite a few programming tricks because I had to. But later on, when I had a C64, its Basic was so bad and so slow that you had no option but to find a good Pascal compiler or learn Assembly.

Same here. Pascal was taught in the first semester, and yet most of the students learned (or already knew) Basic on the side. And then Pascal mysteriously disappeared, as you said. I kept using Turbo Pascal for side projects until Fortran 90 was accessible.

Python is not the only language with those features - and it was not the first either. Haskell is also interactive, it is quite powerful, and it has a compiler as well. It was introduced before Python, and yet it never became popular. And I’m sure there are others.

I think with the right marketing one could push even Intercal to #1. One of Intercal’s “features” is that you have to say “please” in your code “often enough” (how often is, of course, not specified). If your program doesn’t have enough “please” statements, it won’t compile because you are considered rude; if you add too many “please” statements, your code won’t compile either, because you are “excessively polite” (I am not kidding).

My first contact with any sort of programming was on an AVR microcontroller with 128 bytes of RAM, 2K of program memory and pushbuttons/LEDs as the user interface.

Edit: RAM amount

1 Like

My father had a Casio FX-702P, which is still working. My Sharp PC-1261, bought in 1984, is still working too. Good old Japanese technology of the 80’s!
I remember having seen my father programming on a Texas Instruments with red LEDs at the end of the 70’s, maybe a TI-57 or something similar, and you had something like 50 “steps” (instructions) of RAM.
Learning programming in those days was of course not possible without simultaneously learning to optimize your code (both for size and speed). You had to be ingenious - no choice… Scarcity always feeds inventiveness. And the reflexes acquired then last a lifetime.

Yes, other languages with similar qualities could have had Python’s destiny. Don’t try to find THE reason - there are good reasons and contingent ones. Once upon a time Python was in the noise of the TIOBE index, sometimes out of the Top 20, sometimes in. Then the signal got amplified into a virtuous (or vicious, if you prefer) circle, attracting more and more people. But I think it could have been another language…

1 Like

Bits or Bytes?

An interesting phenomenon in the TIOBE index is that Assembly language has come back into the Top 20 over the last decade (#7 in August 2022), because of embedded electronics:
https://www.tiobe.com/tiobe-index/assembly-language/

The FX-702P was the “big brother” of my PB-410. Take the FX-702P, reduce the screen to 12 characters instead of 20, add a tiny “database” nobody ever used - and there, you have the PB-410.
Casio had a few “calculators” with C as their programming language, but the real king was the PB-2000C. That beast had 32/64K RAM, a graphics display, optional BASIC/PROLOG cards, and extension ports for pretty much anything you could wish for. If the Internet had been a thing back then, this would have been the machine for a new species: digital nomads of the 80’s.
Of course, it was expensive. With a little more money you could buy a C64, so it couldn’t compete. It was just for hardcore on-the-road programmers with deep pockets - obviously, not exactly a crowded target audience… Still a notable example of 80’s high-tech gadgets, though.

1 Like

I did not know that one - it seems to have been a dream machine! On this site, they say it was produced in 1989. And in this video, we can see it had a dock with a 3.5" floppy drive! And I read here that there was also a Pascal ROM.

I have found another one programmable in C: the Casio fx-890P, with a 16-bit CPU!

1 Like

In 1979, Sharp commercialized the PC-1300S, which offered a “mini-FORTRAN” language. But it looks more like a kind of BASIC with DO loops instead of FOR loops.

Good catch, I remembered the specs wrong. :smile:

The AT90S2313 provides the following features: 2K bytes of In-System Programmable Flash, 128 bytes EEPROM, 128 bytes SRAM, 15 general purpose I/O lines, 32 general purpose working registers, flexible Timer/Counters with compare modes, internal and external interrupts, a programmable serial UART

1 Like

I’m curious which libraries you’re referring to here. Perhaps ones that I’ve never come across, but when I think of the prime candidates for things that “Python is blatantly not suitable for”, I think of libraries like SciPy and NumPy, which farm their computationally heavy work out to compiled languages (even Fortran). To my mind there’s nothing wrong with that - an easy-to-use scripting language as an interface to faster languages that do the heavy lifting when the need arises.

You’ve got to remember that for the vast majority of coders, speed is not an issue. Scientific computing is a minority, and computationally demanding scientific computing even more so. Python has a rich, well documented ecosystem, so if speed isn’t an issue, why would folks choose another language? Type safety could be another reason to ditch Python, but most folk aren’t writing production code where this is a big enough problem to worry about.

Taking data scientists as an example: in Python, I can load a dataset, fit a machine learning model to it and plot the results in <10 lines of code. If the dataset is small - a few thousand lines or so - this probably takes only a matter of seconds or less to run. I can do it in a notebook so I can easily show it to a non-programmer, and with a few more lines of code I can make it into an interactive web interface. There’s no motivation to choose another language that would just make my life more difficult.

As the Fortran ecosystem develops, thanks to the wonderful people on this forum (and elsewhere), I think Fortran will start chipping away at the Python userbase - starting with those edge cases where people’s code takes minutes to run and they find that annoying. Running Fortran interactively in notebooks, thanks to LFortran, will only help that shift.

Python (along with R and other such languages) offers a gentle introduction to programming, and I think the advantages that brings in terms of getting more people into coding in the first place far outweigh the downsides of them becoming “stuck” with that language. Convincing a non-coder to do a bit of exploratory data analysis in Python isn’t too difficult; convincing them to do it in Fortran is (especially if they’re on Windows, which they probably will be).

3 Likes

Nothing wrong with that, but excellent environments for that purpose existed long before Python was something worth mentioning. So I see no reason to put up with the fact that one tab is OK but eight spaces is a syntax error, or whatever (and yes, that issue can be solved easily with the right editor, but come on now).
Fortran packages are indeed used in the background (Scilab, for example, doesn’t hide this fact at all: you need a Fortran compiler to build Scilab from source). However, fast libraries written in compiled languages won’t help much if you call them within an… interpreted loop that includes decision making (which is the case more often than not).

How exactly it will make your life more difficult is beyond me. We are not comparing Python with FORTRAN IV here. You start with excellent array support and a syntax that’s far from what it used to be 40 years ago.

Totally agree. I won’t use that feature much, but I’m glad it exists, because others find it extremely valuable, and it will definitely help that shift. Personally, I find it just convenient in simple cases. When things start to get serious, it’s time to close the notebook.

Well, I assume you generalize from data analysis to everything, so I’ll use an example from recent personal experience. Last summer my nephew asked me to teach him “some real programming”. He is 13; at school they learn a pseudo-code language and something resembling Logo. He had no idea what “compiling” is. I taught him a little Fortran and had no issues at all. We spent about… 3-4 hours over two sessions, and he could already write nice little programs. He liked it and wanted to show his latest “masterpiece” of a program running to his friends. It’s not hard at all to convince people to use Fortran, as long as they are not brainwashed. If nothing else, it is way easier than C/C++, and not much harder than Python, actually.

I don’t think installing Python is easier than installing MinGW-w64 on that platform. I’d rather talk them out of that mega-spyware system and that scripting-tool language, though. To each their own, of course; I am not really interested in arguing about that.

1 Like

Personally I like both Python and Fortran and I don’t see them as competitors. I work in fusion analysing experimental data. You can do a lot with Python and the scientific stack (numpy, scipy, etc.).
Speed? Well, it depends: when you are using arrays of more than 100,000 (or even millions of) elements and you can use numpy and scipy, a few Python loops (of tens of iterations) don’t slow things down much.
What is really important are the graphing tools (where you can easily zoom and pan), or the ability to stop the program and plot an array.
There is a very fast interactive library, pyqtgraph (much faster than matplotlib, based on PyQt, which itself is based on Qt), where you can easily interact with every element. That is fundamental for writing an interactive GUI that helps you analyse data between one plasma shot and the next while you are in the control room.

And Fortran?

Well, if you know what will be slow in Python (basically every time you are solving an ODE or minimizing a function, or you need a calculation that cannot be done fast enough with numpy), you can easily write a Fortran function, connect it to Python with ctypes, and you get all the speed you need.
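
As a rough sketch of that workflow (the module, routine and file names below are just made-up examples, not anything from a real project): a small Fortran routine with C binding can be compiled into a shared library, e.g. gfortran -shared -fPIC rhs.f90 -o librhs.so, and then loaded from Python with ctypes.CDLL:

module rhs_mod
   use iso_c_binding, only: c_int, c_double
   implicit none
contains
   ! Example right-hand side: simple logistic growth, evaluated element-wise.
   ! The bind(c) attribute gives the routine a plain C symbol that ctypes can find.
   subroutine logistic_rhs(n, y, dydt) bind(c, name='logistic_rhs')
      integer(c_int), intent(in), value :: n
      real(c_double), intent(in)  :: y(n)
      real(c_double), intent(out) :: dydt(n)
      dydt = y * (1.0_c_double - y)
   end subroutine logistic_rhs
end module rhs_mod

On the Python side, ctypes.CDLL('./librhs.so') loads it, and the argument types can be declared with ctypes (or numpy.ctypeslib) so that numpy arrays are passed straight through to the compiled routine.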

Of course on HPC we use Fortran (and sadly some C++).

The problem is that many people are not sensitive enough to speed.
I remember once, in a laboratory, they were using Matlab to do some non-linear fitting of probe data, and it was awfully slow: the “interactive” program took some hours to complete. I felt that it couldn’t really be that slow, so I replaced the fitting procedure with a minimization based on MINPACK. The updated program could be run (serially) in just eight minutes (I don’t know if they continued to use the old program, actually).

Cheers

5 Likes

My example was quite specific for a typical data science workflow: Load a dataset, fit a machine learning model to it and plot the results. Here’s some pseudo Python code to achieve this:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import ConfusionMatrixDisplay

# Read in data from a CSV file
df = pd.read_csv('/path/to/data.csv')

# Split into training and test dataset, presuming we want our model to
# predict `dependent_var` and include all other vars (columns) in the CSV
X = df.drop('dependent_var', axis=1)
y = df['dependent_var']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, stratify=y)

# Fit training data using random forest and use fitted model on test data
rt = RandomForestClassifier()
rt.fit(X_train, y_train)
y_pred = rt.predict(X_test)

# Plot the confusion matrix for the test-set predictions
display = ConfusionMatrixDisplay.from_predictions(y_test, y_pred)

I would be very surprised if you were able to achieve as streamlined a workflow in Fortran. I say “make life more difficult” not because the Fortran code is difficult to write - Fortran has a very intelligible syntax - but because it would simply require a lot more effort to write the code that achieves the same thing. Just take loading a CSV file as an example: in the Python example, that one line of code will generalise to any well-formatted CSV dataset. In Fortran, I would probably turn to csv-fortran, and even with that excellent library it takes a good few lines to read the file, query the variable types, read the header (column names), and then loop through the columns to read them into separate variables. Then I need a library to perform the random forest fit (is there one? If not, I would have to write my own). Finally, I need a library to plot the results, and I’m sure you’ll agree that plotting in Fortran could be better supported…
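
For what it’s worth, here is roughly what that CSV-reading step looks like with csv-fortran, following the pattern of its README example (the file name and column indices are just placeholders):

program read_csv_sketch
   use csv_module
   use iso_fortran_env, only: wp => real64
   implicit none
   type(csv_file) :: f
   character(len=30), allocatable :: header(:)
   integer, allocatable :: itypes(:)
   real(wp), allocatable :: x(:), y(:)
   logical :: status_ok

   ! Read the whole file, taking the first row as the header
   call f%read('data.csv', header_row=1, status_ok=status_ok)

   ! Column names and the inferred type of each column
   call f%get_header(header, status_ok)
   call f%variable_types(itypes, status_ok)

   ! Pull individual columns into arrays, one call per column
   call f%get(1, x, status_ok)
   call f%get(2, y, status_ok)

   call f%destroy()
end program read_csv_sketch

Compact, but still noticeably more ceremony than a single pd.read_csv call - which is exactly the convenience point being made above.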

I’m thinking of colleagues who are experimentalists. They collect data from their experiments and usually analyse that data in Excel (which I hope you agree is worse than Python :wink:). Sometimes, I can just about nudge them to use R or Python to do the analysis or plot the results. They don’t have any time to spend learning new things, so if I mentioned to them the need to learn about compiling code, data types, etc, I’m sure the answer would be “I’ll stick with Excel thanks”. Of course, in an ideal world, once they’ve realised how useful programming is, I would then gently nudge towards Fortran - but you’re right, that often doesn’t happen.

And actually, in a really ideal world, we would have more people like you teaching people Fortran at an early age!

By the way, I think this type of discussion is very useful to have - it really gets you thinking about the pitfalls and advantages of different languages and workflows!

2 Likes

You have some points here. I never needed the CSV format specifically, but I have needed others, probably more complicated, and it wasn’t a big deal to write a Fortran program to parse them. If I need a library and it doesn’t exist, I will write my own - and I will enjoy it, because I like programming anyway. Not to mention it will be tailor-made for my needs. Granted, this takes time, and some people don’t have the motivation to do so. Others will just use Python because there is a library for what they want and it is “good enough”.
However, I am still wondering why other languages that are objectively better didn’t have similar success, with the result that there are not many libraries for them today. Haskell, for example, has a nice syntax, it’s interactive, and it has a compiler. And yet you find loads of libraries for… interpreted Python, but not for Haskell.

You say it as if it were a bad thing… I actually write my own libraries even when one already exists but isn’t quite as “good” as I want it to be.

Of course I agree. However, I think PLplot is quite good for “on-the-fly” plots, and for “serious” plotting nothing beats gnuplot, so there is no need to reinvent the wheel. How hard is it to save the data and then run a simple gnuplot script (even while the program is still running) to get your plots, animations, whatever… Years ago, I wrote a Fortran program for processing and visualizing results taken from a digital microscope. Gnuplot was driven interactively with the mouse or keyboard, so the user could plot a surface based on the microscope data, magnify a part of it, reprocess that region with additional criteria, etc… all while the Fortran program was running.
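
As a minimal sketch of that “dump the data, then call gnuplot” pattern (the file names and the gnuplot command line are just illustrative):

program quick_plot
   implicit none
   integer :: i, u
   real :: x

   ! Dump the data to a plain text file
   open(newunit=u, file='data.dat', status='replace', action='write')
   do i = 0, 100
      x = 0.1 * real(i)
      write(u, *) x, sin(x)
   end do
   close(u)

   ! Write a one-line gnuplot script and run it, even mid-computation
   open(newunit=u, file='plot.gp', status='replace', action='write')
   write(u, '(a)') "plot 'data.dat' with lines title 'sin(x)'"
   close(u)
   call execute_command_line('gnuplot -persist plot.gp')
end program quick_plot

execute_command_line is standard Fortran 2008, so nothing beyond a gnuplot installation is needed.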

I know exactly what you mean here; I’ve met such people. Let me be a bit frank and say that those “Excel-lovers” will never bother to learn R, which would be a great tool for them (and even has a nice GUI for whoever likes that kind of thing). They will never bother learning Python either… who cares if it is obviously light years ahead of spreadsheets (and, even worse, Excel specifically). Who cares if it’s easy to learn R, Python, whatever. Those people will never bother learning anything, and I wouldn’t call them scientists… whatever that term means, they aren’t.

I guess it’s easy at an early age because they are not brainwashed yet. But I’m not a Fortran crusader, I taught my nephew some Fortran because he asked me “how do you code” and then “show me”. So of course I did, and will do it again. It was a great experience because it was the first time I realized it’s actually pretty easy to teach Fortran even to youngsters. I dare say it’s actually easier than teaching Fortran to first-year University students.

1 Like

The obvious solution is to buy faster hardware :smile:
if you are rich enough and don’t care about energy efficiency and ecological concerns…

In the three American financial services companies I have worked for, Microsoft Office (Excel, Word, PowerPoint) is dominant. If you are an analyst or trader and do not work in the IT department, you are unlikely to spend your time learning a programming language (except VBA to script Excel). However, Microsoft is bringing Python to Excel, and in 2024 I look forward to trying it and showing colleagues how to use Python. I think that, for security reasons, the Python allowed in Excel will be functions from packages such as NumPy and pandas that Microsoft bundles, not general user Python code.

1 Like

I think perhaps the explanation is that Python just “got lucky” - right time and right place. I think (and I might be wrong) that it being dynamically typed was a big pull for a lot of people. I’ve got to admit I enjoy not having to worry about types (until it all goes wrong and I spend ages trying to find a bug that would have been caught by static typing :laughing:).

I say it’s bad in the sense that it takes time. I love writing libraries - I would happily spend my days doing so, and of course there are lots of advantages in terms of getting to know the methods you are using much better. But unfortunately folks are so time-pressured (and funders give such a pittance for software development tasks) that if it’s a choice between “write my own Fortran library” and “use a pre-existing Python library”, the latter will almost always win - even if the library is, as you say, only “good enough”. I wish it weren’t so!

I’ve got to admit that I’ve not tried PLplot, and it’s been a while since I last dived into gnuplot (back then I felt there was a steep learning curve - but it was probably worth it). Again though, it comes back to convenience. Thinking about NetCDF files: in Python I can plot a (1D, 2D, 3D…) variable as simply as…

import xarray as xr

# Open the NetCDF dataset and plot one of its variables
ds = xr.open_dataset('/path/to/netcdf/file.nc')
ds['var_name'].plot()

It’s hard to match that brevity in Fortran - though that’s largely because the libraries don’t exist. If there were a Fortran version of pandas or xarray, in particular one offering plotting capabilities, it would be a bit of a game changer in my opinion. If only I had the time…

Some people fit your description, sure. But there are lots of folks who, I feel, just need a gentle nudge, and crucially the barrier to getting them to take those first steps into the world of programming needs to be very low (or, more to the point, must not take much time to learn) - hence R, Python, etc.

1 Like

Well, with the wrong algorithm there is no hardware that can help you! :grinning:

For plotting in Fortran I found the library ogpf (GitHub: kookma/ogpf), an object-based interface to gnuplot from Fortran 2003/2008 and later, which provides a Fortran API on top of gnuplot that almost resembles the feel of plotting with matplotlib. I think something like this would be worth having as one of the fortran-lang libraries, so that there is at least a reference easy-to-use plotting API in Fortran.
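
For anyone curious, here is a hedged sketch of what ogpf usage looks like, based on the examples in its README (treat the method names as illustrative rather than tested code):

program ogpf_demo
   use ogpf, only: gpf
   implicit none
   integer, parameter :: n = 50
   integer :: i
   real(8) :: x(n), y(n)
   type(gpf) :: gp

   ! Some sample data to plot
   do i = 1, n
      x(i) = 0.1d0 * real(i - 1, 8)
      y(i) = sin(x(i))
   end do

   ! Matplotlib-like calls; ogpf writes and runs a gnuplot script behind the scenes
   call gp%title('sin(x) via ogpf')
   call gp%xlabel('x')
   call gp%ylabel('sin(x)')
   call gp%plot(x, y)
end program ogpf_demo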

4 Likes

Python, as a scripting language, doesn’t have a compiler to do static type checking. So of course it’s dynamically typed. I consider this a huge minus but, as you said, many will take it as a big plus.

Definitely worth it, even though gnuplot’s manual is not exactly well structured. The information is there, but it could be better organized and it lacks examples - however, there are plenty of websites filling that gap. And no, the program won’t be as short as your Python example, because of the lack of libraries. I am aware of the library @hkvzjal mentioned - maybe that’s the answer here, but I can’t confirm it. It is on my long “to do” list, though.

You are right, I should not generalize. Some people do need a “gentle nudge” indeed. I guess I lack the patience needed for that: I have tried a few times, but whenever I face a wall of deafness I tend to walk away and not come back. I am not the right person to “save the world”. Still, I doubt people who think all that matters is convenience and minimizing the time spent learning anything new will go further than Python anyway.

1 Like

Following Niklaus Wirth’s death, I have just discovered Wirth’s law:

It will make good reading for today:
N. Wirth, “A Plea for Lean Software,” Computer, vol. 28, no. 2, pp. 64-68, 1995. doi:10.1109/2.348001