The recent discussion on this forum about fortran-not-so-fast-as-they-claim motivated me to find the time to clean up more of my codes and release them to the broader Fortran community.
So here we go: I’m releasing a library of fast math functions I’ve been building over time:
perazz/fastmath: A Modern Fortran library for fast, approximate math functions (github.com)
This library contains approximate versions of exp(x), log(x), 1/sqrt(x) and sincos(x).
I’m pretty happy with both their accuracy and their speed: they’re more accurate than I need in most of my cases, though I’m sure their performance will vary wildly with compiler, CPU and architecture.
Here are some runtimes from the test case on my EPYC workstation with gcc-12 and -O3:
*** fast exp(x) test ***
quintic spline : time = 12.5000 ns/eval, speed-up= 1.79X, relerr= 1.10814E-006
cubic spline : time = 11.1979 ns/eval, speed-up= 2.00X, relerr= 2.39339E-004
intrinsic: time = 22.3958 ns/eval, speed-up= 1.00X, relerr= 0.00000E+000
linear : time = 13.5417 ns/eval, speed-up= 1.65X, relerr= 1.90068E-002
degree 2 : time = 14.5833 ns/eval, speed-up= 1.54X, relerr= 1.78246E-003
degree 3 : time = 14.0625 ns/eval, speed-up= 1.59X, relerr= 1.23874E-004
degree 4 : time = 14.8438 ns/eval, speed-up= 1.51X, relerr= 7.19199E-006
degree 5 : time = 15.8854 ns/eval, speed-up= 1.41X, relerr= 3.56917E-007
*** fast log(x) test ***
spline(5): time = 3.1250 ns/eval, speed-up=10.00X, relerr= 2.72011E-006
spline(3): time = 3.1250 ns/eval, speed-up=10.00X, relerr= 3.93305E-005
intrinsic: time = 32.0312 ns/eval, speed-up= 0.98X, relerr= 0.00000E+000
*** fast 1/sqrt(x) test ***
intrinsic: time = 2.0312 ns/eval, speed-up= 1.05X, relerr= 0.00000E+000
quake3: time = 1.5625 ns/eval, speed-up= 1.36X, relerr= 9.36080E-004
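For readers who have not met the "quake3" trick named in the last row, here is a minimal sketch of the classic bit-level inverse square root in single precision. It is the textbook version (well-known magic constant plus one Newton-Raphson step), not necessarily what fastmath does internally. The names `rsqrt_sketch` and `fast_rsqrt` are made up for illustration and are not part of the library's API.

```fortran
module rsqrt_sketch
    ! Hypothetical module name, for illustration only (not from fastmath).
    use iso_fortran_env, only: real32, int32
    implicit none
    private
    public :: fast_rsqrt

contains

    ! Classic "Quake III" approximation of 1/sqrt(x), valid for x > 0.
    elemental function fast_rsqrt(x) result(y)
        real(real32), intent(in) :: x
        real(real32) :: y
        ! Widely published magic constant 0x5F3759DF
        integer(int32), parameter :: magic = int(z'5F3759DF', int32)
        integer(int32) :: i

        ! Reinterpret the bits of x as a 32-bit integer
        i = transfer(x, 0_int32)
        ! Shifting right halves the exponent; subtracting from the magic
        ! constant negates it, giving a rough guess for x**(-0.5)
        i = magic - shiftr(i, 1)
        ! Reinterpret the integer bits back as a real
        y = transfer(i, 1.0_real32)
        ! One Newton-Raphson step on f(y) = 1/y**2 - x refines the guess
        y = y * (1.5_real32 - 0.5_real32 * x * y * y)
    end function fast_rsqrt

end module rsqrt_sketch

program rsqrt_demo
    use iso_fortran_env, only: real32
    use rsqrt_sketch, only: fast_rsqrt
    implicit none
    real(real32) :: x(5), approx(5), exact(5)

    x = [0.25_real32, 1.0_real32, 2.0_real32, 10.0_real32, 1000.0_real32]
    approx = fast_rsqrt(x)          ! elemental, so it applies to the whole array
    exact  = 1.0_real32 / sqrt(x)

    print '(a,es12.5)', 'max relative error = ', maxval(abs(approx - exact) / exact)
end program rsqrt_demo
```

With a single Newton step the relative error of this textbook variant is on the order of 1E-3. A second step would tighten it at the cost of a few more multiplications, which matters when the intrinsic is already as fast as the timings above suggest.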