The counter-intuitive rise of Python in scientific computing

If you have access: “Why scientists are turning to Rust”
https://www.nature.com/articles/d41586-020-03382-2
Some advantages of Rust:

  • “the compiler produces particularly informative error messages”
  • Cargo, the build tool and package manager that ships with the Rust compiler, seems to be a Swiss Army knife: “to compile Rust code, run tests, auto-generate documentation, upload a package to a repository and more…”
  • “It’s because the community is so fantastic.”

They talk about C, C++, Python, R, Matlab, etc. Not Fortran.

From the article:

But for many Rustaceans, the human element is equally compelling. Hauck, a member of the LGBT+ community, says that Rust users have gone out of their way to make her feel welcome. The community, she says, has “always made an effort to be extremely inclusive — like, very much aware of how diversity impacts things; very aware of how to write a code of conduct and enforce that code of conduct”.

What does this mean? Which programming language communities exclude LGBT people? Wikipedia says

Rust was originally designed by Graydon Hoare at Mozilla Research, with contributions from Dave Herman, Brendan Eich, and others

Eich, who also created JavaScript, resigned as CEO of Mozilla in 2014 amid controversy over his political views. I worry that cancelling people who do not follow a “code of conduct” can exclude valuable contributors.

Sure, I think you’d have to dig a little to find outwardly LGBT-hostile language communities. But for many people, it’s a big plus to have a community that advertises itself as caring about LGBT issues and has a track record of enforcing a Code of Conduct to back it up. It’s the difference between “not being negative” and “being positive”.

Seen from France, this paragraph of the article sounds quite strange, I mean in a scientific journal. But if it is there, there is probably a social reality behind it…

When talking about programming and Fortran, should we care about each other’s private lives? OK, if you commit crimes, it could be a problem… And crimes can be relative, as we can see in George Orwell’s “1984” newspeak:
https://en.wikipedia.org/wiki/Newspeak#List_of_Newspeak_words
But don’t worry, if you commit thoughtcrime (or crimethink), and other 1984 crimes, I swear I will still talk with you!
Let’s keep the Fortran Discourse a doubleplusgood place…

UPDATE: to stay on topic, did you know there is a song called “Julia” on Eurythmics’ 1984 album? It’s serendipity, isn’t it? Hmm… no song about Python or any snake (a symbolic animal)…

“The Rust compiler”, does that mean there is only one?

Essentially, yes.

And the Wikipedia page says it is based on LLVM, and that it has been able to compile itself since 2011.

Modern Fortran is one of many array programming languages, but an advantage it has as a compiled, statically-typed language is that loops do not incur a speed penalty and are not frowned upon, and you can program in the style you want. On a Hacker News thread people complain about being “forced” to write Python/NumPy and R code in a vectorized manner.


systemvoltage
13 hours ago

Pandas is basically impossible for me to use without dozens of Google searches after being familiar with it for over 7 years. Of course, I don’t use it daily, but it’s one of those pieces of software that has a very non-intuitive API. Does anyone else find it difficult to use, or is it just me?

In particular, I find this answer infuriating [1]. I’ve come across it so many times. Look, I have a CSV with 200 rows and I need to loop through them in the most intuitive way. Sure, it’s not optimal but I don’t want fast code. I have a mental model of how to modify this dataframe. Let me do it, please.

[1] python - How to iterate over rows in a DataFrame in Pandas - Stack Overflow

dwohnitmok
10 hours ago

I understand where you’re coming from (yes yes there’s a theoretically “good” way of doing this, but come on, why can’t I just do the simplest thing), but I also sympathize with the spirit of that SO answer (although I agree that its presentation is wanting).

Pandas is really a DSL unto itself and is heavily influenced by R, where the same dynamic happens. Programmers coming from a background where procedural control flow constructs are basically second nature bump up against statisticians for whom array-based programming (in the form of overloaded mathematical notation acting on both scalar and vector values) is second nature.

R and pandas are both very array-oriented programming languages (the most extreme example of this might be early-era APL) and it’s really going against the grain to implement things with explicit iteration.

It’s kind of like trying to program in Python without using loops or list comprehensions and asking just how to do everything in recursion. You can… but someone is bound to point out that doing everything with recursion (and the concomitant trampolines to prevent stack overflows) is not the Pythonic way.


True, it was not until the late 2000s that I began using the array syntax in Fortran. But if I prefer to write something more classical like do i=1, 10000 ; a(i) = a(i)+b(i) ; end do instead, I can. And it will not slow down my program. At least, not significantly as far as I remember.

I am now discovering GNU Octave (compatible with Matlab), which also seems to have a vectorized philosophy. But I am an absolute beginner…
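
For comparison, here is a minimal sketch of the same two styles in Python, using only the standard library (the names n, a_loop and a_vec are made up for illustration). It only shows that the explicit loop and the whole-array form give the same result; it says nothing about timings, which is where the pressure to vectorize in Python/NumPy comes from.

```python
# Two ways to add b to a, element by element (pure Python, no NumPy).
n = 10_000
a = [float(i) for i in range(n)]
b = [2.0] * n

# Explicit-loop style, like the Fortran do loop above.
a_loop = list(a)
for i in range(n):
    a_loop[i] = a_loop[i] + b[i]

# Whole-array style, the closest standard-library analogue of a = a + b.
a_vec = [x + y for x, y in zip(a, b)]

same_result = a_loop == a_vec
print(same_result)  # True
```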

I just wanted to react to this: indeed, our private lives are our own and let’s keep fortran-lang related spaces focused on Fortran or related to Fortran and its community, exactly as you said.

That is already the case: this Discourse, as well as other repositories under fortran-lang, is governed by a Code of Conduct; you can find a link to it from here. In particular, quoting from it:

Examples of behavior that contributes to creating a positive environment include:

  • Using welcoming and inclusive language
  • Being respectful of differing viewpoints and experiences

Examples of unacceptable behavior by participants include:

  • Trolling, insulting/derogatory comments, and personal or political attacks
  • Public or private harassment

This makes it very clear that, e.g., politics is off topic, and that we should try as much as we can to be welcoming and inclusive to everyone, no matter their personal life or opinions.

After this long discussion, what I felt was this: if we believe that Fortran is best for us and that it will be best for others too, then we should make tutorials. Each of us can make tutorials in our own domain; then the language will spread and can be adopted by others. Otherwise the amount of resources that other languages have will just push Fortran into oblivion.
Again, if we feel that this will do good for others, in the sense that their life will be easier coding in Fortran (for scientific computing of course…), then we can make tutorials and share them on this website. It will be practice for us and for others as well. The standardization that goes into the major languages is just too much. Fortran, right now, is a mine yet to be dug and explored. Many of us don’t know what Fortran can do, excluding some giants of course…

Yes, the problem is that every floating-point literal is treated as real*4, regardless of the data type on the left-hand side. I can’t count how many times I have found these bugs in legacy codes, and even in new codes.
It just isn’t intuitive.
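
To put a number on this kind of bug, here is a rough sketch in Python, standard library only. It is an emulation, not Fortran: the helper to_single is a made-up name that rounds a double to the nearest single-precision value via the struct module, mimicking what happens when a default-real literal such as 0.1 is assigned to a real*8 variable.

```python
import struct

def to_single(x: float) -> float:
    """Round a double-precision float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Mimics assigning the single-precision literal 0.1 to a real*8 variable.
x_from_single_literal = to_single(0.1)
# Mimics assigning the double-precision literal 0.1d0 instead.
x_from_double_literal = 0.1

# The difference is what the single-precision literal silently costs.
error = abs(x_from_single_literal - x_from_double_literal)
print(x_from_single_literal)
print(error)
```

The error is on the order of 1e-9, far larger than double-precision rounding, and it appears without any warning from the assignment itself.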

To summarise from another post, I would say Python is popular because of its ability to give you numbers (sometimes random) where Fortran fails.
Among scientists at large, the quality of Python code is in general like that of Fortran code: abysmal.
One thing Python did well is liberate the scientific community from the not-invented-here syndrome.

What is that syndrome?

REAL :: PI = 22/7

too clever by .14159…

Is 22/7 a Euclidean division or a real division? :smile:

Exactly. But that should surely be integer division. :slight_smile:
We also need some consistency in data types.

The “pleasure” a lot of scientists take in writing their own algorithms for things they should not, instead of using libraries… And here I am not speaking about subtle mathematical functions that are in the standard but where one needs to control precision or speed…

In fairness to them, for the younger ones who may not get this… in Python 3, / means division as humans know it…

Yes, in Python 3, integer (floor) division is //:

>>> 22/7
3.142857142857143
>>> 22//7
3

In GNU Octave (thus also Matlab):

>> 22/7
ans =  3.1429

In languages where / is Euclidean when both numbers are integers, it’s a big trap for students! I need to explain that there are two different circuits in the CPU, one for computing on reals and one for computing on integers.
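
A small sketch of that trap, emulated in Python (the variable names are made up for illustration). Here // stands in for the integer division that REAL :: PI = 22/7 performs before the result is converted to a real, and making one operand real is the usual fix.

```python
import math

# Emulates "REAL :: PI = 22/7": integer division first, conversion to real after.
pi_truncated = float(22 // 7)
# Making one operand real gives real division.
pi_real = 22.0 / 7

print(pi_truncated)                      # 3.0
print(round(math.pi - pi_truncated, 5))  # 0.14159
print(round(pi_real, 4))                 # 3.1429
```

One caveat on the emulation: Python’s // floors, whereas Fortran integer division truncates toward zero, so the two only agree when the operands have the same sign, as they do here.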
