Encouraged by this article, I wrote my own very simple wrapper for arrays in C++. My convolution benchmark ran 15% faster in C++ than in Fortran, despite using objects with an overloaded
operator() to emulate arrays. Of course, a modern compiler inlined all of this nicely and vectorized it efficiently. One more nail in the coffin, I guess.
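
For anyone wondering what such a wrapper might look like, here is a minimal sketch. It is not the actual benchmark code: the names `Array2D` and `convolve_valid`, the row-major layout, zero-based indexing, and the unflipped sliding-window form of the convolution are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal 2D array wrapper (illustrative names, not from the post).
class Array2D {
public:
    Array2D(std::size_t rows, std::size_t cols, double init = 0.0)
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    // Fortran-like a(i, j) access. Storage is row-major and zero-based here.
    // Defined in-class, so the compiler is free to inline these calls.
    double& operator()(std::size_t i, std::size_t j) {
        return data_[i * cols_ + j];
    }
    const double& operator()(std::size_t i, std::size_t j) const {
        return data_[i * cols_ + j];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};

// Sliding-window sum of products over the "valid" region only.
// Assumption: the kernel is not flipped, so strictly speaking this is a
// cross-correlation, which is what many convolution benchmarks compute.
Array2D convolve_valid(const Array2D& in, const Array2D& kernel) {
    assert(kernel.rows() >= 1 && kernel.cols() >= 1);
    assert(kernel.rows() <= in.rows() && kernel.cols() <= in.cols());

    Array2D out(in.rows() - kernel.rows() + 1, in.cols() - kernel.cols() + 1);
    for (std::size_t i = 0; i < out.rows(); ++i) {
        for (std::size_t j = 0; j < out.cols(); ++j) {
            double acc = 0.0;
            for (std::size_t p = 0; p < kernel.rows(); ++p) {
                for (std::size_t q = 0; q < kernel.cols(); ++q) {
                    acc += in(i + p, j + q) * kernel(p, q);
                }
            }
            out(i, j) = acc;
        }
    }
    return out;
}
```

Whether the `operator()` calls actually get inlined and the inner loops vectorized depends on the compiler and the optimization flags, so it is worth checking the generated code or a vectorization report before trusting any timing.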