I just want to ask the members of this community: which software that you regularly use or develop is written in Fortran 90 or later? I am not asking about the GitHub packages (utilities) linked on the homepage. I am asking about complete, working software.
How many LOC are they?
If you are developing software of around 0.2 to 0.3 million LOC, have you faced any glitches compared to other scientific or general-purpose programming languages?
I am not sure if I get the difference between the two. My research group has a home-grown Fortran 2018 software package (also available on GitHub), the private branch of which is ~150K lines of pure modern Fortran code. No glitches or problems, thanks to heavy and comprehensive testing (around 98% coverage or higher) with multiple compilers. All user complaints communicated to us so far have turned out to be due to the user's mistakes or misunderstandings and not to software glitches. This was not the case early in the development several years ago, when the package's test suite was not yet comprehensive.
That said, I have worked with half-century-old codebases mixing Fortran 66 to Fortran 2003, where the addition of an empty if-block `if (.true.) then end if` in a single subroutine changed the behavior of the program entirely. I do not know if it was eventually resolved, but I am 100% confident it was due to some unknown semantic bug(s) in the code that had gone undetected due to a lack of comprehensive testing and the collaborative development of the software over 5 decades by ~200 graduate students who did not know much about Fortran, let alone software development.
Most of the software in the scientific category at the package index is production ready and used in production. I am using several of the projects listed there on a daily basis for my research.
For example, the xtb project is something we are developing in our research group. It currently has ~140k LOC and contains code from 30 contributors since it was started in 2017. It is always a bit difficult to put usage numbers on OSS because we are not tracking our users. The releases have been downloaded ~40k times and it is cloned a dozen times a day. One of the main publications for this software is seeing 20 to 30 citations a month. If I had to guess, I would estimate the user base at several tens of thousands.
Another one would be DFTB+ with ~180k lines and contributions from 24 people. DFTB+ is a really exciting project, because it is quite modular and allows easy combination of most of its features. The drawback is that building it with full functionality involves at least a dozen projects and external libraries.
Fortran works nicely at this scale; we hardly ever encounter issues that make it impossible to implement something. The main issues we have seen in developing projects of this size concern the build systems.
I’m heavily involved in the development of DAMASK (https://damask3.mpie.de, which is also on GitHub). It has about 30k lines of Fortran code. The main publication is cited about 80 times per year (Google Scholar) and I know of several external users from academia.
Distribution of binary packages (we support Debian, Ubuntu, SUSE, Fedora, Conda) and source packages (Arch Linux and spack) requires some time. Using a language-specific ecosystem (Julia, Python/PyPI, Rust/Cargo) would require much less work but would also exclude the use of many libraries.
The main glitch I see for Fortran is the lack of built-in/standardized unit testing and automatic documentation:
- For documentation there is FORD. Hopefully it will be improved in the future, but it has a long way to go to reach the level of Sphinx, which I use for documenting Python code.
- I know that there are testing frameworks, but there, too, much needs to be done to reach the level of Python, Julia and Rust, to name the languages I know. One issue of the core language is the lack of a mechanism for catching exceptions: every test that ends with `error stop` needs to go into its own executable so that its return value can be checked.
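The one-executable-per-test workaround can be automated with a small driver. Below is a minimal sketch in Python (not Fortran), assuming that a test program terminating via `error stop` exits with a non-zero status, which is the usual behaviour but ultimately depends on the compiler. The function name `run_test_executables` and the stand-in commands are hypothetical; in practice the dictionary would hold the paths of the compiled test programs.

```python
import subprocess
import sys


def run_test_executables(commands, timeout=60):
    """Run each command separately and map its name to True if it exited with 0.

    `commands` maps a test name to an argument list, e.g. the path of a
    compiled Fortran test program (hypothetical; none is shipped here).
    """
    results = {}
    for name, cmd in commands.items():
        try:
            completed = subprocess.run(cmd, capture_output=True, timeout=timeout)
            # Assumption: `error stop` yields a non-zero exit status.
            results[name] = completed.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            # A missing executable or a hanging test counts as a failure.
            results[name] = False
    return results


if __name__ == "__main__":
    # Stand-ins for real test executables: one exits cleanly, one with an error.
    demo = {
        "passes": [sys.executable, "-c", "raise SystemExit(0)"],
        "fails": [sys.executable, "-c", "raise SystemExit(1)"],
    }
    for name, ok in run_test_executables(demo).items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

Because every test runs in its own process, one aborting test cannot take the others down with it, which is the property the separate executables are meant to provide.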
One can find large production open-source Fortran codes at GitHub, GitLab, Sourceforge, Zenodo, and other sites.
wc -l $(find . -name "*.f90")
You can imagine that this much code is bound to hit some compiler bugs occasionally, but no more than other languages.
I doubt that much of Abaqus is written in Fortran 90. At least, user subroutines are expected to have the `*.f` extension, and the `-free` flag for the Intel compiler needs to be added to the environment.
Apparently FLUENT® is in Fortran (not sure if that still applies today, but I’d be surprised if they did a full rewrite of the 1.5 million lines of code):
In applying AD to FLUENT®, one of the leading commercial computational
fluid dynamics (CFD) software packages, we show that AD is not only applicable
to small academic programs, but scales to large industrial simulation codes.
… we report on the results of applying the AD technology to the
FLUENT code consisting of approximately 1.6 million lines of Fortran.
There are probably dozens of CFD and structural analysis codes in Fortran that are still actively developed and used. Just a few examples:
These are probably all a mixture of fixed-form F77 and Fortran 90+ code. A few more large-scale Fortran codes routinely run at HPC facilities in Germany can be found in one of my earlier posts: High-Q Club at Jülich Research Centre
I’m pretty sure most national weather services rely upon Fortran codes (including for the assimilation of satellite data?). Maybe @milancurcic can name a few.
Side note: With the criterion `.f90` you might actually skip programs still in `.f`, or started in `.f`, which are relevant for some.
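To avoid missing such files when counting lines, the tally can be done per extension. The following is a rough sketch in Python; the set of extensions and the function name `count_fortran_lines` are assumptions for illustration, and the comparison is case-insensitive so that preprocessed sources such as `.F90` are included as well.

```python
from collections import Counter
from pathlib import Path

# Assumed list of Fortran source extensions (compared in lower case);
# adjust it to the conventions of the project being measured.
FORTRAN_EXTENSIONS = {".f", ".for", ".f77", ".f90", ".f95", ".f03", ".f08"}


def count_fortran_lines(root):
    """Return a Counter mapping lower-cased extension to total line count."""
    totals = Counter()
    for path in Path(root).rglob("*"):
        ext = path.suffix.lower()
        if path.is_file() and ext in FORTRAN_EXTENSIONS:
            # Binary mode sidesteps encoding problems in decades-old sources.
            # Unlike `wc -l`, a last line without a trailing newline is counted.
            with path.open("rb") as handle:
                totals[ext] += sum(1 for _ in handle)
    return totals


if __name__ == "__main__":
    counts = count_fortran_lines(".")
    for ext, lines in sorted(counts.items()):
        print(f"{ext}: {lines}")
    print(f"total: {sum(counts.values())}")
```

Like `wc -l`, this counts raw lines, comments and blank lines included, so the numbers are comparable to the LOC figures quoted in this thread only in that rough sense.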
Platon, a program to check crystallographic models of small molecules for consistency and plausibility and to edit them, is an example. If you dig deep enough and search for `platon.f`, which does most of the work (e.g., here), you find .ge. 175k lines of code grown and maintained by A. L. Spek since the 1980s and used (under the hood) by multiple publishers within the checkCIF service.
Of course, ShelX, initiated by George M. Sheldrick in the 1970s, shouldn’t be forgotten either, even though most users only get to see the executables compiled with Intel’s `ifort` (ref) when they get in touch with his group or get them (bundled with other programs) with the purchase of a diffractometer from, e.g., Bruker. In the 2014 paper cited, Sheldrick claimed >8k registered academic users (though the impact of the ShelX programs might be much larger, given that the 2008 review paper alone reached rank #13 of Nature’s Top 100 papers by 2014, based on number of citations).
On the other hand, in a report published this year about «Exclaim», a project by ETH Zurich to model climate and weather data, Météo Suisse replied to reader Lutz’s question
Est-ce donc la fin du Fortran dans les modèles météo ?
(Is this the end of Fortran in the weather models?)
C’est ce que le projet devra déterminer. Il est possible que certaines parties qui sont très spécifiques comme des opérateurs de l’assimilation de donnée reste en Fortran.
(That is what the project will have to determine. It is possible that certain very specific parts, such as data assimilation operators, will remain in Fortran.)
You could add to that list two electronic structure codes:
- CASTEP (http://www.castep.org/, CASTEP - Wikipedia): Fortran 2003, 614k LOC, used in about 1k publications per year. It started as a Fortran 77 code in the late 80s, and was completely rewritten using modern Fortran 20 years ago.
- ONETEP (https://onetep.org/, ONETEP - Wikipedia): Fortran 2003/2008, 475k LOC, used in about 20-30 publications per year. It implements linear-scaling DFT; the original paper is from 2005, but the code had been in the works for a few years before that.
Both are developed mainly (or almost exclusively) in the UK and they are commercialised by Dassault Systèmes BIOVIA as part of their Materials Studio package. They are also available to researchers via inexpensive academic licenses.
For several major drug development software packages, I believe the computation engine (using non-parametric adaptive grid search and Monte Carlo parametric EM algorithms) is written in Fortran, even Fortran 77.