The recent discussion on this forum about fortran-not-so-fast-as-they-claim motivated me to find some more time to clean up and release more of my code and make it available to the broader Fortran community.
So here we go, I’m releasing a library of fast math functions I’ve been creating over time:
This library contains approximated versions of
I’m pretty happy with both their accuracy and speed. They’re more accurate than I need for most of my cases, but I’m sure their performance will vary wildly with compiler, CPU, and architecture.
Here are some runtimes of the test case on my EPYC workstation + gcc-12 + -O3:
    *** fast exp(x) test ***
    quintic spline : time = 12.5000 ns/eval, speed-up= 1.79X, relerr= 1.10814E-006
    cubic spline : time = 11.1979 ns/eval, speed-up= 2.00X, relerr= 2.39339E-004
    intrinsic: time = 22.3958 ns/eval, speed-up= 1.00X, relerr= 0.00000E+000
    linear : time = 13.5417 ns/eval, speed-up= 1.65X, relerr= 1.90068E-002
    degree 2 : time = 14.5833 ns/eval, speed-up= 1.54X, relerr= 1.78246E-003
    degree 3 : time = 14.0625 ns/eval, speed-up= 1.59X, relerr= 1.23874E-004
    degree 4 : time = 14.8438 ns/eval, speed-up= 1.51X, relerr= 7.19199E-006
    degree 5 : time = 15.8854 ns/eval, speed-up= 1.41X, relerr= 3.56917E-007

    *** fast log(x) test ***
    spline(5): time = 3.1250 ns/eval, speed-up=10.00X, relerr= 2.72011E-006
    spline(3): time = 3.1250 ns/eval, speed-up=10.00X, relerr= 3.93305E-005
    intrinsic: time = 32.0312 ns/eval, speed-up= 0.98X, relerr= 0.00000E+000

    *** fast 1/sqrt(x) test ***
    intrinsic: time = 2.0312 ns/eval, speed-up= 1.05X, relerr= 0.00000E+000
    quake3: time = 1.5625 ns/eval, speed-up= 1.36X, relerr= 9.36080E-004
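
For readers who haven’t seen the "quake3" trick before, here is a minimal sketch of the classic Quake III fast inverse square root written in Fortran. This is the textbook variant, not necessarily what the library does internally. The module and function names (`fast_rsqrt_sketch`, `rsqrt_quake3`) are made up for illustration, and the sketch assumes 32-bit IEEE-754 single precision with a positive, normal input.

```fortran
module fast_rsqrt_sketch
   ! Hypothetical module name, for illustration only.
   use iso_fortran_env, only: real32, int32
   implicit none
   private
   public :: rsqrt_quake3

contains

   ! Approximates 1/sqrt(x) for x > 0 (IEEE-754 single precision assumed).
   elemental function rsqrt_quake3(x) result(y)
      real(real32), intent(in) :: x
      real(real32) :: y
      ! The well-known magic constant 0x5F3759DF, written in decimal.
      integer(int32), parameter :: magic = 1597463007_int32
      integer(int32) :: i

      ! Reinterpret the bit pattern of x as a 32-bit integer.
      i = transfer(x, i)
      ! Integer arithmetic on the bits gives a rough first guess.
      i = magic - shiftr(i, 1)
      ! Reinterpret the result back as a real.
      y = transfer(i, y)
      ! One Newton-Raphson step on f(y) = 1/y**2 - x refines the guess.
      y = y*(1.5_real32 - 0.5_real32*x*y*y)
   end function rsqrt_quake3

end module fast_rsqrt_sketch
```

Because the function is `elemental`, it can be applied to a whole array at once, e.g. `y = rsqrt_quake3(x)` with `x` a `real(real32)` array. With a single Newton step the relative error is on the order of 1e-3; a second step would tighten it at the cost of a few more multiplications.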