Looking at the control-flow graph in Compiler Explorer, there is one path which is free of malloc
’s and I speculate this is the case when the matrix is contiguous. In any case compilers are able to detect the generation of array copies and report it. In gfortran the option is -fcheck=array-temps
.
It’s good to be aware of the problem, especially if the code is meant to run on a large machine and for lots of iterations. But people make dozens of copies all the time in MATLAB or SciPy, and very few seem to care about it. For example in scipy.linalg.solve
(a wrapper for ?GESV
) the input matrix and right-handside are not overwritten by default and users leave the module make copies all the time.
In case of gfortran, writing BLAS wrappers could be avoided in favor of the intrinsic matmul
+ the compiler flag -fexternal-blas
. More information can be found here: Mapping matrix & vector arithmetic to BLAS calls - #6 by ivanpribec