Some Intrinsic SUMS

This is always a choice that has to be made when designing Fortran intrinsic functions. Should the intrinsic always give the best performance, leaving the programmer to write extra code when accuracy matters, or should it always give accurate results, leaving the programmer to write faster code when speed matters?

Algorithms like the one in pairsum() can also be implemented without recursive calls. Before recursion was added to the language in Fortran 90, that was the only way to implement such algorithms: the programmer explicitly maintains the intermediate quantities that would otherwise live on the call stack. Now the language supports recursion, which makes the code a little easier to write, but sometimes at a performance cost.
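To make the non-recursive approach concrete, here is a minimal sketch of a pairwise sum that keeps its partial sums in a small local array instead of on the call stack. The names pairsum_iter_mod, pairsum_iter, and partial are hypothetical, chosen only for this illustration. It is a bottom-up variant, so for lengths that are not a power of two it groups the terms somewhat differently from a top-down recursive split, and it assumes default real kind and an array size that fits in a default integer.

```fortran
module pairsum_iter_mod
   implicit none
   private
   public :: pairsum_iter
contains
   ! Pairwise summation without recursion (illustrative sketch only).
   function pairsum_iter(x) result(s)
      real, intent(in) :: x(:)
      real :: s
      ! Explicit stack of partial sums. Its depth never exceeds the
      ! number of bits in the element index, so 64 slots is ample
      ! for a default (assumed 32-bit) integer index.
      real :: partial(64)
      integer :: top, i, k

      top = 0
      do i = 1, size(x)
         ! Push the next element as a block of length one.
         top = top + 1
         partial(top) = x(i)
         ! Merge the two topmost blocks once for every trailing
         ! zero bit of i, so only equal-sized blocks are combined.
         k = i
         do while (iand(k, 1) == 0)
            partial(top-1) = partial(top-1) + partial(top)
            top = top - 1
            k = ishft(k, -1)
         end do
      end do

      ! Fold the leftover blocks, smallest block first.
      s = 0.0
      do while (top > 0)
         s = s + partial(top)
         top = top - 1
      end do
   end function pairsum_iter
end module pairsum_iter_mod

program demo
   use pairsum_iter_mod, only: pairsum_iter
   implicit none
   real :: x(1000)
   x = 0.1
   print *, 'intrinsic sum :', sum(x)
   print *, 'pairsum_iter  :', pairsum_iter(x)
end program demo
```

The partial array plays the role of the call stack: each slot holds the sum of a completed block, and the bit test on the loop index decides when two blocks are the same size and can be merged. The fixed 64-element array uses a small, known amount of storage regardless of the input length.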

As for whether one needs to worry about stack limits, I would point out that in many cases the exact same compiler and operating system are used on a laptop or desktop machine as on a high-end cluster, so if you must worry about it on one, you must also worry about it on the other. Further, many sites buy hardware and then run it until it fails and can no longer be repaired. So even if new hardware is available, the programmer must also accommodate 10- to 20-year-old machines at the same site (sometimes even in the same cluster).