If you’re aware that round-off errors due to finite precision are a problem, you can still use sum
but you need to convert the array beforehand, i.e. sum(real(t,sk))
where sk
> kind(t)
.
Take the following code as an example:
module prec
integer, parameter :: sp = kind(1.0)
integer, parameter :: dp = kind(1.0d0)
end module
subroutine do_sum(n,a,sa,da)
use prec
integer, intent(in) :: n
real(sp), intent(in) :: a(n)
real(sp), intent(out) :: sa
real(dp), intent(out) :: da
sa = sum(a)
da = sum(real(a,kind(da)))
end subroutine
Here’s the resulting assembly listing, when compiled with x86-64 gfortran 12.1 and -Os
flag (I’m using this to keep the assembly listing readable):
do_sum_:
movsx rdi, DWORD PTR [rdi]
xor eax, eax
xorps xmm0, xmm0
.L3:
inc rax
cmp rdi, rax
jl .L2
addss xmm0, DWORD PTR [rsi-4+rax*4]
jmp .L3
.L2:
movss DWORD PTR [rdx], xmm0
xor eax, eax
xorps xmm0, xmm0
.L5:
inc rax
cmp rdi, rax
jl .L4
cvtss2sd xmm1, DWORD PTR [rsi-4+rax*4]
addsd xmm0, xmm1
jmp .L5
.L4:
movsd QWORD PTR [rcx], xmm0
ret
The loop between L3 and L2 is the single-precision sum. The loop between L5 and L4 is the double-precision sum. Note the additional cvtss2sd
instruction which converts a single-precision operand to double precision.
If you replace the sum(real(a,kind(da))
with an explicit loop,
da = 0.0_dp
do i = 1, n
da = da + a(i) ! automatic conversion
end do
you get the exact same assembly instructions. No array temporary is created.