Fortran Arrays in C++

Not to mention that this (auto-)vectorizes nicely. With, say, gfortran -O3 -march=skylake-avx512 -mprefer-vector-width=512, the bulk of the work gets done in the hot loop:

.L5:
        vmovups zmm0, ZMMWORD PTR [r15+rax]
        vfmadd213ps     zmm0, zmm1, ZMMWORD PTR [rdx+rax]
        vmovups ZMMWORD PTR [rdx+rax], zmm0
        add     rax, 64
        cmp     rdi, rax
        jne     .L5
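
Reading the loop: each pass loads 16 floats from one array, does a fused multiply-add against a register that holds a broadcast scalar (zmm1) and 16 floats from a second array, and stores the result back into that second array. That is the pattern you would expect from something like y = a*x + y in single precision. For comparison, here is a rough C++ sketch of such a loop. The name saxpy and its signature are my own for illustration, not taken from the code discussed above, and whether a given C++ compiler emits the same instructions will depend on its version and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name and signature are assumptions for illustration):
// computes y = a*x + y element-wise in single precision.
// Every iteration is independent of the others, which is what lets an
// optimizing compiler turn the loop into packed fused multiply-adds.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
    assert(x.size() == y.size());   // both arrays must have the same length

    const float* xp = x.data();
    float* yp = y.data();
    const std::size_t n = y.size();

    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];  // one multiply-add per element
    }
}
```

Calling saxpy(2.0f, x, y) with x = {1, 2, 3} and y = {10, 20, 30} leaves y = {12, 24, 36}. The add rax, 64 in the assembly matches 16 four-byte floats per 512-bit register, so one pass through .L5 covers 16 iterations of the scalar loop.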