Fast_math: A collection of functions for fast number crunching using Fortran

Yes, I participated in that discussion and it motivated me to seek an approach that would give some kind of compromise between the accuracy and speed issues.

This, or doing a fully random walk before adding up, could minimize the error, but indeed the price in computational time would be too high.

That’s why I came up with the idea of “chunking” the sum/dot_product into a small vector that separates sequential values while still crunching incoming batches of memory as they arrive, with no need to jump around in memory or to sort. (The idea came in part from one of the papers cited in that thread.)
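To make the chunking idea concrete, here is a minimal sketch of how I picture it for a plain sum. The module name `chunked_sum_sketch`, the function name `chunked_sum` and the width `chunk = 8` are made up for illustration and are not necessarily what the library uses.

```fortran
module chunked_sum_sketch
  use iso_fortran_env, only: dp => real64
  implicit none
  private
  public :: chunked_sum, dp

  ! Hypothetical accumulator width; a real implementation would tune this.
  integer, parameter :: chunk = 8

contains

  pure function chunked_sum(x) result(s)
    real(dp), intent(in) :: x(:)
    real(dp) :: s
    real(dp) :: acc(chunk)
    integer :: i, n, nfull

    n = size(x)
    nfull = n - mod(n, chunk)   ! number of elements covered by full batches
    acc = 0.0_dp

    ! Stream through memory in contiguous batches. Lane k only ever
    ! accumulates elements k, k+chunk, k+2*chunk, ..., so neighbouring
    ! values land in different partial sums.
    do i = 1, nfull, chunk
      acc = acc + x(i:i+chunk-1)
    end do

    ! Fold the leftover tail (fewer than chunk elements) into the first lanes.
    acc(1:n-nfull) = acc(1:n-nfull) + x(nfull+1:n)

    ! Final reduction of the small accumulator vector.
    s = sum(acc)
  end function chunked_sum

end module chunked_sum_sketch
```

Calling `chunked_sum(x)` on a rank-1 `real(dp)` array gives the same kind of result as the intrinsic `sum(x)`, only with a different summation order. A dot product version would presumably accumulate `a(i:i+chunk-1) * b(i:i+chunk-1)` in the loop instead.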

This is extremely important! A key use case is the dot product in iterative solvers, which has to be synchronized at each iteration. I’ll give the paper a read, thanks! :slight_smile: … I could imagine that a vectorized approach could also be extended to the reduction step of a distributed sum/dot product, to keep it as consistent as possible.
