Fortran runtime math library

We need a long-term, sustainable solution for the intrinsic (and non-intrinsic) runtime math library in LFortran. If we design things well, it could be used by Flang (@pmk) and other compilers as well.

In stdlib, we plan to implement special functions that are not intrinsic. As an example, Bessel functions of integer order are intrinsic, but those of half-integer order are not. The quality of the implementation, however, should be the same, and I wonder if we should simply consider both intrinsic and non-intrinsic functions under the same umbrella, the only difference being that intrinsic functions would be used by the compiler itself, while non-intrinsic ones would be called via stdlib.

Compilers typically have at least two implementations: accurate but slower (no -ffast-math), and slightly less accurate but faster (-ffast-math). Users can choose.

I don’t know how to design such a switch for non-intrinsic functions in stdlib.

I have brainstormed this a bit more here: Runtime library math functions (#244) · Issues · lfortran / lfortran · GitLab.

In particular, in stdlib we would have a good Fortran API. So it would make sense to me to implement intrinsic functions (sin, cos, …) in Fortran as well, at the API level, using a similar design. There can be several implementations: one calling into openlibm using iso_c_binding, another into libpgmath, another into the standard C library, and another can be a pure Fortran implementation.
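To make the idea concrete, here is a minimal sketch of what such a wrapper module could look like (the module and function names are hypothetical). It exposes a Fortran-level sin that delegates to the C library's sin via iso_c_binding; openlibm exports the same symbol, so rebinding to it would only change the link step:

```fortran
! Hypothetical sketch of a stdlib-style wrapper: a Fortran-level API
! whose implementation delegates to a C runtime (here the C library's
! sin; openlibm provides the same symbol).
module stdlib_math_wrap
   use, intrinsic :: iso_c_binding, only: c_double
   implicit none
   private
   public :: stdlib_sin

   interface
      ! Bind to the C library's double-precision sin.
      pure function c_sin(x) result(y) bind(c, name="sin")
         import :: c_double
         real(c_double), value :: x
         real(c_double) :: y
      end function c_sin
   end interface

contains

   elemental function stdlib_sin(x) result(y)
      real(c_double), intent(in) :: x
      real(c_double) :: y
      y = c_sin(x)
   end function stdlib_sin

end module stdlib_math_wrap
```

A compiler that inlines across module boundaries could reduce this wrapper to a direct call into the underlying C routine.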

If a compiler can then use such a Fortran high-level API (for example, you point it to such an implementation as a Fortran module), then users could easily switch and provide their own preferred implementation. The compiler optimizations could optimize such a "wrapper" away so that there is no additional overhead over calling into libpgmath directly (for example). Furthermore, non-intrinsic functions would simply be called from stdlib, but the interface and internal implementation would be almost identical.

To rephrase it: libm is an interface to a math library at the C level. Is it technically possible to provide an interface at the Fortran level? The Fortran API for all intrinsic (math) functions is described in Table 16.1 of the Fortran 2018 standard. Can a compiler simply compile and link a standalone Fortran implementation, or must such functions be hardwired into the compiler at a lower level for performance (or other) reasons?

There are other intrinsic functions that are not math related, and each compiler must implement those as well, but that is a different issue.

CCing @milancurcic for opinion regarding stdlib and @pmk for Flang.


Another way to look at this: if stdlib were extended to also include intrinsic functions (whether as part of stdlib or as a separate library), then I would be interested in using it in LFortran. In stdlib our aim was to create a "reference implementation", preferably in pure Fortran (sort of like LAPACK), and to allow compilers to provide an optimized implementation (MKL, OpenBLAS). Even as only a "reference implementation" it would be useful for LFortran.

Another question is how to provide an optimized implementation, whether this can be done using a Fortran high level API, or whether it requires some deeper compiler integration (but if so, then we should figure out how to provide such optimized compiler integrated version for non-intrinsic functions also).


Some ideas about how it could work in stdlib:

  1. Using a preprocessor macro, the user would choose the optimization level at stdlib compile time. Some libraries do this (I know of ESMF, and I bet there are more). This leads to a cleaner API for the user (you always have only one function, optimized or not optimized), but it’s less flexible because to compare the two you’d need two stdlib builds. It also complicates binary releases of stdlib because in this case we’d need to package all binary builds times 2 (optimized and not optimized builds).
  2. Have optimized and not-optimized functions with the same function name, and have them sit in different modules (e.g. stdlib_math and stdlib_math_optim).
  3. Have optimized and non-optimized functions with different function names, but in the same module.

In options 2 and 3, stdlib would ship with both variants of the same functions.
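A minimal sketch of option 2, assuming hypothetical module names stdlib_math and stdlib_math_optim; the user picks a variant purely through the use statement:

```fortran
! Sketch of option 2: the same function name lives in two modules,
! so switching the use statement switches the implementation.
module stdlib_math
   implicit none
contains
   elemental real function stdlib_tanh(x)
      real, intent(in) :: x
      stdlib_tanh = tanh(x)   ! accurate reference path
   end function stdlib_tanh
end module stdlib_math

module stdlib_math_optim
   implicit none
contains
   elemental real function stdlib_tanh(x)
      real, intent(in) :: x
      ! A faster, slightly less accurate approximation would live here;
      ! this sketch just falls back to the intrinsic.
      stdlib_tanh = tanh(x)
   end function stdlib_tanh
end module stdlib_math_optim

program demo
   use stdlib_math_optim, only: stdlib_tanh   ! swap the module to swap variants
   implicit none
   print *, stdlib_tanh(1.0)
end program demo
```

With option 3 the two implementations would instead carry different names (say, a _fast suffix) inside a single module.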

Of course, all of the above pertains to the code being optimized at the source level (to illustrate with a trivial example, is it calling matmul() or is it calling sgemm()?). What the compiler does with each of the variants using different flags is another “unknown”.


@milancurcic from the stdlib side, we can figure out the mechanics of optimized / non-optimized versions. The main question is: do you think it’s a good idea to also include intrinsic functions as part of stdlib?

The vector-function-abi-variant attribute lets you at least specify SIMD variants.
While this wouldn’t cover the case of scalar code switching implementations based on -ffast-math, it would at least allow you to provide implementations optimized for both scalar and SIMD execution.
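On the Fortran side, a roughly analogous mechanism is the OpenMP declare simd directive, which asks the compiler to emit vector entry points for a scalar function. A minimal sketch (module and function names are made up for illustration):

```fortran
! Sketch: the OpenMP `declare simd` directive advertises vector (SIMD)
! variants of a scalar function, playing a role similar to the
! vector-function-abi-variant attribute on the C/LLVM side.
! Compile with OpenMP enabled (e.g. -fopenmp) for the directive to
! take effect.
module simd_math
   implicit none
contains
   function poly_eval(x) result(y)
      !$omp declare simd(poly_eval)
      real, intent(in) :: x
      real :: y
      y = 1.0 + x*(0.5 + x*0.25)   ! a cheap polynomial stand-in
   end function poly_eval
end module simd_math
```

Calls to poly_eval from a vectorized loop can then use the generated SIMD variant instead of scalarizing.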


Thanks for the link @chriselrod and welcome to the community!

From my perspective as a volunteer, implementing an entirely new math library in Fortran is a large effort. The existing libraries seem to be at least partially, or even fully, supported by compiler vendors or other powerful organizations. As an example, many of the libm functions seem to have originated at Sun Microsystems.

I attempted to implement the cube root previously (Cube root of a real number · Issue #214 · fortran-lang/stdlib · GitHub), copying the low-level algorithm used in libm. Some challenges I faced in Fortran:

  • How to deal with endianness? E.g., in an implementation you might want to check the sign bit of a floating-point number, or access the mantissa and exponent individually. The C math libraries resort to macros and preprocessing to achieve this. I guess at least initially one would probably support only x86_64 systems and assume there is IEEE support.
  • Compile-time errors. If you try to take the square root of a negative literal constant, compilers will flag this as an error. That behavior cannot be implemented in a Fortran module.
  • Verification suite. Given their importance in engineering, scientific, and financial calculations, math functions are something you really have to get right. It would seem irresponsible to provide functions which are not very thoroughly tested. Writing tests which cover all the possible corner cases and make sure the values are precise down to the last ULP is not a small effort.
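For the bit-access point above, here is a sketch of one portable approach: transfer() plus the standard bit intrinsics can pull out the sign, exponent, and mantissa fields of an IEEE binary64 value without C-style macros (assuming an IEEE representation; the field positions below are those of binary64):

```fortran
! Sketch: inspecting the fields of an IEEE binary64 value in Fortran
! via transfer() and the bit-manipulation intrinsics, avoiding the
! C preprocessor tricks used in libm.
program inspect_bits
   use, intrinsic :: iso_fortran_env, only: real64, int64
   implicit none
   real(real64) :: x
   integer(int64) :: bits
   x = -2.0_real64
   bits = transfer(x, bits)                   ! reinterpret the bit pattern
   print *, 'sign bit :', ibits(bits, 63, 1)  ! 1 for negative values
   print *, 'exponent :', ibits(bits, 52, 11) ! biased exponent field
   print *, 'mantissa :', ibits(bits, 0, 52)  ! fraction bits
end program inspect_bits
```

Since transfer() and ibits() operate on the value's bit pattern rather than on bytes in memory, this avoids the explicit endianness handling that the C macro approach needs.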

Personally, I would prefer to only add functions to stdlib which are missing from the intrinsic runtime library (for example the erfinv function: List of special mathematical functions to include · Issue #179 · fortran-lang/stdlib · GitHub).

But there is at least one good reason why a Fortran-implemented math library might be used, which is bit reproducibility: Proposal for bit-reproducible numerical operations · Issue #12 · fortran-lang/stdlib · GitHub

I have no objections to a new math library aimed towards Fortran outside of stdlib (e.g. a joint library between compiler developers, or a library which provides SIMD specializations for the case of calling transcendental functions on arrays, etc.).


As a step that would be required for a stdlib-intrinsics library, and that could provide value on its own, what about a QA test of the intrinsics? I have been involved in issues with SUM() (because it is not required to condition the data) and with double-precision trig functions that used a table and were off by one value (the cause was the difference between 0-based and 1-based arrays in different languages). So maybe starting there would

  1. be a required step of implementing a stdlib-intrinsics library
  2. might expose QA issues
  3. would be very interesting if it exposed major performance discrepancies
  4. would be of value even if an independent library were not supported
  5. might supply better performance via CUDA or coarrays than some vendor routines
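As a sketch of what such a QA suite could build on, here is a small ULP-distance measurement using the usual ordered-integer mapping of the IEEE bit pattern; the reference literal is sin(1) rounded to double precision, and in a real suite it would come from a higher-precision oracle:

```fortran
! Sketch of a QA-style check: measure the ULP distance between an
! intrinsic result and a trusted reference value by comparing ordered
! integer representations of the two binary64 numbers.
program ulp_check
   use, intrinsic :: iso_fortran_env, only: real64, int64
   implicit none
   real(real64) :: computed, reference
   computed  = sin(1.0_real64)
   reference = 0.8414709848078965_real64   ! sin(1) to double precision
   print *, 'ulp distance:', abs(to_ordered(computed) - to_ordered(reference))
contains
   ! Map the IEEE binary64 bit pattern to a monotonically ordered
   ! integer, so that adjacent representable reals differ by exactly 1.
   integer(int64) function to_ordered(x)
      real(real64), intent(in) :: x
      to_ordered = transfer(x, 0_int64)
      if (to_ordered < 0_int64) &
         to_ordered = (-huge(0_int64) - 1_int64) - to_ordered
   end function to_ordered
end program ulp_check
```

Running such a check over a dense sweep of inputs, per intrinsic and per compiler, would give exactly the kind of cross-vendor comparison described above.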

As a footnote: we used to have our own "intrinsics" routines, and through a combination of preprocessors, EXTERNAL statements, and function pointers we could switch between the vendor routines and in-house routines (not just for QA, but for performance (mostly vectorization back then) and bit-reproducibility). That was pre-F90 and is long gone for the most part. There used to be a suite of Fortran 77 conformance tests we ran that I think was originally from NIST, and a set of routines called "paranoia" (one was "REAL", one was "DOUBLEPRECISION") that might have been from Argonne. The latter still survives, heavily modified, and is still in use in some in-house scenarios in various places; it was really good at finding dusty-corner issues even with the same compiler with different options on or off.

So along those lines, what about starting with "testing" that might be supported by "stdlib" volunteers, or might even get some sponsorship?

It is odd how (as far as I know) compilers are not certified by any body as being "2003" or "2008" compliant, except by the user community (which has been pretty good at shouting down any vendor that prematurely claimed an "NNNN-compliant" compiler, I suppose). Does a compiler vendor actually submit a compiler for certification, or is that totally community-policed?

The community as a whole probably has experiences that would contribute to "dusty corner" testing. I could picture such a suite surpassing the in-house tests compiler vendors have accrued, and it might be of value to the entire community, compilers, developers, and users alike.

Not sure about the license issues (which might be many), but the GNU compiler has a number of open-source compiler tests; lawyers have probably prevented other compiler vendors from leveraging those for legal reasons. A "non-affiliated" set of tests might be welcomed by some vendors?

I think it would also invigorate the development of a generic unit-test module for stdlib, which would be useful regardless of whether a stdlib-intrinsics library emerges or not.


dparanoia(3f) and sparanoia(3f) were last maintained by Richard Karpinski, University of California, as far as I have been able to find. The output was good, but it required some in-depth knowledge of various floating-point operations on different platforms to know whether or not you cared about some of the results.



@ivanpribec all the problems you mentioned are already present for any special-function implementation that we want to have in stdlib and that we agreed is in scope. If there were a Fortran version, it would obviously need to be tested to ensure accuracy. Otherwise we would call into openlibm, libpgmath, or any other such library, but even then it should still be tested.

Yes, stdlib is a large effort. So is writing a compiler.

My idea was obviously to reuse as much as possible, but allowing an implementation to be written (down the road) in Fortran could be useful for speed reasons; ultimately, Fortran is meant for numerical scientific computing, which includes fast and accurate evaluation of functions. We have to think long term: as the community contributes, we can improve upon things. In the short run we obviously have to reuse what is already available.

My main question at the moment really is about the API, that could be used by compilers such as LFortran:

  • Does it make sense to have (community-maintained) Fortran modules with implementations of intrinsic functions? Such an implementation would initially call into C using iso_c_binding.
  • What technical obstacles are there?

There are several possible obstacles already identified in this thread:

  • Performance: the compiler should be able to optimize out sin and cos and use machine instructions (if they are faster) and in general get rid of unnecessary “wrappers” whenever possible.
  • Compile-time errors: the compiler must have some knowledge of the math functions.

On a practical level, the only other alternative is to hardwire, for example, libpgmath into LFortran in the LLVM backend. This is quite a significant amount of work, as the backend has to identify the mapping between each intrinsic function and its actual implementation in libpgmath, the signatures of those C functions, and how to transform all arguments and results between Fortran and C. In fact, that is exactly how we started; you can see here exactly what I am talking about. This approach is non-portable, as we cannot easily switch such runtime implementations: openlibm or our own implementation would in general have a different C-level API (such as for passing around complex or quadruple-precision numbers, among other things). This approach led me to create this thread.

I don’t know the best way forward, but I am asking the community to help brainstorm what would make sense. And also not see a Fortran compiler as a black box, but rather as a community maintained open source tool.

I just want to comment that writing a good math library is HARD. My experience is that vendor math library teams have advanced mathematicians working on argument reduction and other things that provide “good to the last bit” results across the full argument range. I have personally known several of these.

Nowadays, math libraries are also continually being updated to take advantage of vectorization and new instruction sets. Compilers, as well, understand some of the more common intrinsics and inline them, or do things such as sin-cos merging, or make use of private interfaces that have reduced overhead.

I know you mean well, but if you want to do anything here, focus on functions that are not intrinsic.


Thanks for the feedback @sblionel. Indeed, I know some of those people also. It is not always good to the last bit, but it is usually incredibly accurate (I tested a few functions from the openlibm library: for a given function, about 80% of floating-point inputs were accurate to all bits, and about 20% were off by 1 ulp).

For LFortran and Flang, it is the intrinsic functions to focus on. My main question was about the best API for the compilers to use. I think most of the functions are already written in either openlibm or libpgmath, so we can simply use those.

For stdlib it looks like most people agree to focus on non-intrinsic.

This reminds me of an article by Hanson and Hopkins about routines to calculate the Euclidean norm. While the algorithm is very simple in itself, you have to take care of lots of special situations. Their test suite consists of more than 50 cases. For details see

Another impressive work I would like to bring up here is "The Mathematical Function Computation Handbook" by Beebe. It is impressive if only for the level of detail used to describe how one may calculate all manner of functions to an accuracy of one or perhaps several ulps.


On quality of implementation of Fortran 2008 complex intrinsic functions on branch cuts

Anton Shterenlikht

Branch cuts in complex functions in combination with signed zero and signed infinity have important uses in fracture mechanics, jet flow and aerofoil analysis. We present benchmarks for validating Fortran 2008 complex functions - LOG, SQRT, ASIN, ACOS, ATAN, ASINH, ACOSH and ATANH - on branch cuts with arguments of all 3 IEEE floating point binary formats: binary32, binary64 and binary128. Results are reported with 8 Fortran 2008 compilers: GCC, Flang, Cray, Oracle, PGI, Intel, NAG and IBM. Multiple test failures were revealed, e.g. wrong signs of results or unexpected overflow, underflow, or NaN. We conclude that the quality of implementation of these Fortran 2008 intrinsics in many compilers is not yet sufficient to remove the need for special code for branch cuts. The test results are complemented by conformal maps of the branch cuts and detailed derivations of the values of these functions on branch cuts, to be used as a reference. The benchmarks are freely available from this http URL. This work will be of interest to engineers who use complex functions, as well as to compiler and maths library developers.


Thanks @Beliavsky, nice article.

In a recent thread on the Intel Fortran forum (Need help : different result with the same computation - Intel Community) the reproducibility within one run was discussed. How does your environment deal with this type of noise?

I wonder if a pure-Fortran implementation would always be reproducible and how this can be guaranteed (I am also thinking about Steve Lionel’s presentation on the subject :)).

There’s a current comp.lang.fortran thread Bug in Gfortran/GCC Math Library (?) regarding the single-precision tanh function.