I just created an example on Godbolt (Compiler Explorer) to compare the assembly gfortran generates for adding arrays of a derived type versus arrays of intrinsic reals:
integer, parameter :: n = 10
type(foo) :: a(n), b(n), c(n), d(n) ! derived type
type(real) :: aa(n), bb(n), cc(n), dd(n) ! intrinsic type
! ... fill values ...
d = a + b + c
dd = aa + bb + cc
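The snippet above leaves out the definition of foo. For anyone who wants to reproduce this, here is a minimal sketch of what it could look like. The module name testfoo and the function name foo_add are inferred from the symbol __testfoo_MOD_foo_add in the assembly further down; the single real component x and the fill values are my own placeholders and not necessarily what the Godbolt example used. I have assumed the function is elemental, since that is the usual way to let a scalar operator apply element-wise to whole arrays.

```fortran
module testfoo
  implicit none
  private
  public :: foo, operator(+)

  ! Hypothetical derived type: the real definition is not shown in the post.
  type :: foo
    real :: x = 0.0
  end type foo

  interface operator(+)
    module procedure foo_add
  end interface

contains

  ! Elemental, so the scalar definition also applies to conforming arrays.
  elemental function foo_add(lhs, rhs) result(res)
    type(foo), intent(in) :: lhs, rhs
    type(foo) :: res
    res%x = lhs%x + rhs%x
  end function foo_add

end module testfoo

program demo
  use testfoo, only: foo, operator(+)
  implicit none
  integer, parameter :: n = 10
  integer :: i
  type(foo) :: a(n), b(n), c(n), d(n)

  ! Placeholder fill values (assumption, not from the original example).
  a = [(foo(real(i)), i = 1, n)]
  b = foo(1.0)
  c = foo(2.0)

  ! Whole-array expression: foo_add is applied element by element.
  d = a + b + c

  print *, d%x
end program demo
```

With this in place, d = a + b + c compiles without any hand-written loop; the compiler is free to generate the loop itself.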
For the intrinsic reals, gfortran generates the assembly
mov edx, 1
jmp .L20
.L37:
lea rax, [rdx-1]
movss xmm0, DWORD PTR [rsp+960+rax*4]
addss xmm0, DWORD PTR [rsp+832+rax*4]
addss xmm0, DWORD PTR [rsp+704+rax*4]
movss DWORD PTR [rsp+576+rax*4], xmm0
add rdx, 1
.L20:
cmp rdx, 10
jle .L37
One can recognize the loop check in .L20 and the two add operations (the two addss instructions).
For the derived type, the assembly is
mov ebx, 1
jmp .L19
.L36:
lea r12, [rbx-1]
lea rbp, [0+r12*8]
lea rsi, [rsp+880+rbp]
lea rdi, [rsp+1008+rbp]
call __testfoo_MOD_foo_add
mov QWORD PTR [rsp+1096], rax
lea rsi, [rsp+752+rbp]
lea rdi, [rsp+1096]
call __testfoo_MOD_foo_add
mov QWORD PTR [rsp+624+r12*8], rax
add rbx, 1
.L19:
cmp rbx, 10
jle .L36
Again the loop and the two add operations (now calls to __testfoo_MOD_foo_add) are recognizable, meaning the array expression has been lowered to the equivalent of this explicit loop:
do i = 1, 10
d(i) = (a(i) + b(i)) + c(i)
end do
No need for engineers and scientists to fuss around with expression templates and meta-programming, unless they want precise control over the generated code. This reminds me of a NASA report from the time when Fortran compilers did not yet fully support multidimensional arrays (in terms of the quality of the generated machine instructions). Apparently, some engineers wrote their own linear algebra routines with matrices stored as vectors.
Edit: I’ve edited the last paragraph for accuracy