Scientists are using artificial intelligence and large language models to rewrite old code in modern languages

I think pointers (in the C sense) should be used sparingly. The modern C++ mantra is “no raw pointers”. If a Fortran 77 code used the default integer to store addresses, it was pragmatism (or short-sightedness) on the part of the original developers. No doubt there were concerns like performance (e.g. long integers on a 32-bit architecture), the lack of dynamic memory allocation, and so forth.

The size of int was also a problem in C codes, as I’ve written here: Integer 4 or integer 8? - #23 by ivanpribec

Common blocks are indeed a problem. In a code from an aerospace agency, all the big arrays were part of common blocks, and they ran into ELF format restrictions (addressing difficulties). This was one of those codes where compilation is part of the application, i.e. the program is recompiled with fixed array sizes for each run (I think NEK5000 does this too). I’ve always seen this as an anti-pattern, but I’ve read it was done this way for performance reasons. On the other hand, explicitly unrolling loops over small dimensions can be helpful. I believe there are tricks to achieve this in C++. Recently, !$omp unroll was also introduced.

In principle, common blocks can be refactored automatically by a restructuring tool (see the discussion in Making legacy Fortran code type safe through automated program transformation | The Journal of Supercomputing); however, it is hard to find the (financial) incentives to grow such tools beyond academic showcases.


Libraries like Eigen and Armadillo can be quite nice and also achieve performance through template meta-programming techniques (one way to imagine this is that C++ compilers have an embedded interpreter for a sub-dialect of C++ that is evaluated at compile time). My opinion is that, at the end of the day, they still fall short of a full-blown array type like Fortran has. Certainly, C++ performance programmers can achieve very nice things with templates and constexpr, but I wouldn’t recommend this coding style to domain scientists.

This has been said many times before, but an advantage of C++ is its generic containers, including std::map (an associative array), std::unordered_map (a hash table), and the associated algorithms. The container adaptors like std::stack and std::queue are also useful in certain types of problems. I’m afraid that nothing we can build in Fortran (currently) can be as practical or as extensible. Maybe the Fortran generics feature can fill this gap, but I strongly doubt the committee would be willing to provide any standard algorithms and containers building on it, due to lack of manpower.

Anyway, if the LANL machine learning approach for Fortran to C++ translation is successful, I see no reason why you couldn’t take the same neural network and retarget it to produce refactored Fortran code. I would recommend writing a good set of tests first.