@dionhaefner Nice paper, it’s well written. I skimmed it when it first came out and have now read it in more detail.
I didn’t actually find any concrete or strong argument in the paper that Fortran syntax is inadequate for GPU computing (and it certainly is not, IMO). The benchmarks don’t seem unfavorable to Fortran+MPI. JAX seems like a powerful tool. The “Fast, cheap, turbulent” part of the title reads a bit clickbaity, and I’m surprised it was accepted in that form in JAMES. It could be an opportunity for a Veros v2 paper titled “Faster, cheaper, and just as turbulent”.
> Today, GPUs are the industry standard devices to train artificial neural networks. This trend has also impacted the design of modern compute facilities; for example, out of the 8 upcoming supercomputers in the EuroHPC Joint Undertaking, 7 are going to provide GPU resources, typically making up around 10% of the total compute power (see EuroHPC, 2021). These resources would be unusable with traditional Fortran models without considerable additional effort, such as a complete re-implementation using CUDA Fortran or by using a framework like OpenACC (Wienke et al., 2012), which requires compiler directives for every loop (see also Norman et al., 2015).
I get your point, but I’m skeptical that writing an ocean model from scratch in Python plus a framework is a smaller effort than GPU-ifying an existing ocean model (whether through OpenACC, OpenMP, CUDA, or fine-tuning for nvfortran). It seems like a trade-off: what you get for free with JAX is native code generation for various architectures; what you get for free with an existing ocean model is decades’ worth of battle-testing of the dynamical-core numerics and subgrid-scale algorithms, which you don’t otherwise have.
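To make that "for free" part concrete, here is a minimal sketch of my own (not taken from the paper or the Veros code; the function and parameter names are hypothetical): a single `jax.jit`-compiled stencil update that runs unchanged on CPU or GPU, whereas the equivalent Fortran loop nest would need OpenACC directives or a CUDA Fortran rewrite to target the GPU.

```python
# Minimal sketch (not from the paper or Veros): one explicit diffusion step
# written with jax.numpy. The same function is JIT-compiled for whatever
# backend JAX finds (CPU, GPU, TPU) without architecture-specific annotations.
import jax
import jax.numpy as jnp


@jax.jit
def diffuse(field, kappa=0.1):
    """One explicit diffusion step via a 5-point Laplacian stencil (periodic)."""
    laplacian = (
        jnp.roll(field, 1, axis=0) + jnp.roll(field, -1, axis=0)
        + jnp.roll(field, 1, axis=1) + jnp.roll(field, -1, axis=1)
        - 4.0 * field
    )
    return field + kappa * laplacian


key = jax.random.PRNGKey(0)
field = jax.random.normal(key, (1024, 1024))
field = diffuse(field)  # compiled on first call, for CPU and GPU alike
print(field.shape, jax.devices())
```

Of course, this portability argument says nothing about the cost of re-deriving and re-validating the numerics themselves, which is where the existing Fortran models have their head start.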
> While the flexibility and rich library ecosystem of Python is a strong asset, there are also some notable obstacles when choosing Python over Fortran. Decades of real-world usage and the relative simplicity of the Fortran language have led to an established community standard of model development. As a consequence, most Fortran models read similarly to each other. This is currently not the case in Python development, where the chosen abstraction and library stack have a huge influence on the structure of the model code. This calls for a collective effort to formalize a common interface for the development of high-performance models in Python. We are confident that this can and will happen should this approach gain the required momentum.
I think this is a fair assessment. I look forward to seeing developments on both fronts: improved tooling and multi-architecture frameworks and compilers in Fortran, as well as more unified and stable APIs in the Python ecosystem. I think diversity is good in general, and so it is here; there should be all kinds of scientific numerical models implemented in various languages and frameworks. Only then can we really understand the pros and cons of the different approaches. And a science Ph.D. is difficult enough on its own; a student should be able to program in the language they most enjoy, and I can appreciate and understand that many people do enjoy Python.
Anyhow, great work, and congrats @dionhaefner; I look forward to reading more.