Dot product in single precision emulating double precision

Beliavsky · January 17, 2025, 1:04am

There is a new project

implementing the algorithm in the paper Accurate Sum and Dot Product by Ogita et al., (2005).

Testing the eftdot module with the main program below, the error free transformation does enable a more accurate computation of the dot product in single precision, at the cost of higher CPU time compared to computing the dot product in single or double precision using the intrinsic function. I don’t know if this is of practical importance.

use eftdot, only: sp, dp, dot2_s
implicit none
integer, parameter :: n = 10**7, niter=10, ntimes=4
integer :: iter
real(kind=sp) :: x(n), y(n), result, xy_sp, xy_dp, t(ntimes), dt(ntimes-1)
print*,"n, sp, dp =", n, sp, dp
print "(/,*(a12))", "dot_product","err_exact","err_sp"
dt = 0.0
do iter=1,niter
   call random_number(x)
   call random_number(y)
   x = x - 0.5
   y = y - 0.5
   call cpu_time(t(1))
   call dot2_s(x, y, n, result) ! single precision emulating double
   call cpu_time(t(2))
   xy_sp = dot_product(x,y) ! single precision
   call cpu_time(t(3))
   xy_dp = dot_product(real(x, kind=dp),real(y, kind=dp)) ! double precision
   call cpu_time(t(4))
   print "(*(f12.6))",xy_dp, result-xy_dp, xy_sp - xy_dp
   dt = dt + t(2:ntimes) - t(:ntimes-1)
end do
print "(/,'times',/,*(a12))", "exact_sp", "sp", "dp"
print "(*(f12.6))",dt
end

Sample output is below, compiling with gfortran -O3 -march=native -flto

 n, sp, dp =    10000000           4           8

 dot_product   err_exact      err_sp
 -202.377380    0.000000   -0.007553
   26.850029    0.000006    0.007391
  -44.441853    0.000004    0.009468
 -591.380188    0.000000    0.015259
 -235.354996   -0.000015    0.024811
   50.949928   -0.000004   -0.006489
 -185.757919    0.000000   -0.004684
  -78.702240   -0.000008    0.000664
   45.996658   -0.000008   -0.006851
  366.751495    0.000000    0.013306

times
    exact_sp          sp          dp
    0.203125    0.125000    0.062500

hkvzjal · January 17, 2025, 7:20am

I compared it with the implementations in fast_math (fprod and fprod_kahan) on Compiler Explorer (you’ll see n=1e6, otherwise there was a Segfault because the memory is limited under CE)

gfortran 14.2 -O3 -march=native

n, sp, dp =     1000000           4           8

 dot_product   err_exact      err_sp   err_fprod    err_fsvd     err_fpk   err_fksvd
  -18.873257   -0.000002    0.000046   -0.000002   -0.000179   -0.000002   -0.000004
   -3.847342   -0.000000    0.000829   -0.000000    0.000063   -0.000000   -0.000001
  -43.166283    0.000000    0.000137    0.000000    0.000217    0.000000    0.000011
  -13.259282   -0.000002   -0.001365   -0.000002   -0.000026   -0.000002   -0.000001
  -95.034035    0.000000   -0.002991    0.000000    0.000015    0.000000   -0.000015
   98.777199    0.000000   -0.003441    0.000000   -0.000145    0.000000    0.000023
  -37.271225    0.000004    0.000050    0.000004    0.000118    0.000004    0.000004
  -11.045083   -0.000002    0.000486   -0.000002    0.000035   -0.000002   -0.000014
  -45.170433   -0.000008    0.000225   -0.000008    0.000347   -0.000008   -0.000008
    5.349956    0.000002   -0.000184    0.000002   -0.000067    0.000002   -0.000001

times
    exact_sp          sp          dp    fprod sp    fprod dp   fprodk sp   fprodk dp
    0.054966    0.012406    0.013277    0.005627    0.014181    0.006836    0.017551

More exhaustive testing would be needed, but already from here I see that the precision is not necessarily better and timing-wise there is a x10.

Topic		Replies	Views
Fast_math: A collection of functions for fast number crunching using Fortran Announcements	12	1214	October 3, 2023
Array intrinsics performances/accuracy	20	878	May 20, 2023
Arbitrary/multiple precision libraries	7	366	May 26, 2025
Single, Double or Mixed Precision? Help	14	1142	February 7, 2024
Computing the mean: accuracy and speed	20	1442	June 3, 2023

Dot product in single precision emulating double precision

Related topics