Some reflections about stdlib, where it stands and what it could become

I couldn’t agree more! Does anyone have some insight as to why there is this (apparent) lack of interest? I suppose that, in the long term, lfortran might be shipped with a precompiled version of stdlib (correct me if I’m wrong @certik). What makes the other compilers hesitant?

Yes, we plan to. In fact we just last week resumed working on compiling the latest version.

Regarding how to ship it with the compiler: wouldn’t it make sense to ship fpm and ensure things work well, and users can simply depend on stdlib as any other Fortran package?

Regarding specialized (fast) implementations, what are some examples from stdlib that would make sense for compiler vendors to optimize, other than just regular Fortran optimization? Ideally we can maintain the optimized version as part of stdlib, in Fortran.

I don’t know if it totally fits this point (and I’ve been intending to open a dedicated thread), but stdlib ships several procedures as elemental. The problem is that, as of now, short of compiling with -flto (gfortran) or -ipo (ifort), maximum performance cannot be obtained from these procedures, simply because they are not inlined. I’m of the opinion that elemental procedures (at least those without the impure attribute) should be inlined by default. Now, one could say “well, JUST use -ipo!”, but it is not that easy with large projects.

I can see several reasons, actually:

  • vendors are not directly involved in the development of stdlib, but users would complain to them anyway in case of quality issues (bugs, performance…)
  • stdlib is (obviously) not an established and famous library with decades of use behind it
  • they don’t have enough resources to dedicate people to its maintenance
  • for some vendors (NAG), the content of stdlib may eventually compete with what they sell besides the compiler itself
  • which version should be shipped? It’s not as simple as the C++ stdlib for instance, where the content of the library is defined by the standard.

They should get inlined with LFortran, since LFortran stores the actual code of the module in a .mod file, and then when we compile with LLVM, it can get inlined at the LLVM level, not at link time, which is often too late. For very large projects and incremental compilation one has to do more (separate compilation, etc.), but I think this issue of inlining should be solved at the Fortran level, for any code.

old, but gold :slight_smile:

Regarding the performance issue, which is also related to the inlining question, there is still the (unresolved, I think) question about what users should expect from stdlib.

At one extreme, users might expect exact results, down to the last bit for floating point results, with all possible internal errors either avoided or detected, but with possibly poor performance; in this extreme, an implementation would treat things like inlining, series truncation errors, or the underlying algorithms in general as secondary issues. If a programmer wants a faster and possibly less accurate or less robust algorithm, then he would be expected to write it himself.

At the other extreme, the library might focus primarily on performance, possibly even resulting in less accurate results, inlining would be assumed when practical, and possibly it might not even be expected to recover from internal error conditions. In this case, if a programmer wants a more robust implementation or more accurate results, then he would be expected to write his own version of the code that suits his demands.
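To make the two extremes concrete, here is a small illustrative sketch (in Python for brevity; it is not taken from stdlib or from any compiler's runtime). The function names `naive_sum` and `accurate_sum` are made up for this example. The first is the cheap, "fast" choice; the second relies on Python's `math.fsum`, which does more work per element in exchange for a correctly rounded result.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation: cheap, but rounding errors accumulate."""
    total = 0.0
    for v in values:
        total += v
    return total


def accurate_sum(values):
    """Correctly rounded summation via math.fsum: slower, but more accurate."""
    return math.fsum(values)


data = [0.1] * 10
print(naive_sum(data))     # very close to, but not exactly, 1.0
print(accurate_sum(data))  # 1.0
```

Neither function is "wrong"; the point is that a library has to say which of the two behaviours its users can rely on.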

As far as I know, these choices have not been made for either the intrinsic Fortran library or for the community-maintained stdlib.

If this choice is not made in a public way, then a programmer might be required to maintain his own code for all possible library subroutines and avoid using stdlib altogether. If a particular implementation adopts one of those extremes that doesn’t match the programmer’s requirements, then the programmer must write his own code. When he uses a different compiler that makes different choices, then he already has his own code that must be maintained anyway, so why would he bother using the stdlib version?

I think this should be discussed by the community and some kind of democratic decision made, but my preference would be to have robust and accurate implementations of everything in the standard Fortran library and in stdlib, and then require the programmer to do something special if he wants optimal performance. That “something special” might be linking to his own code, or using compiler options to inline code, or linking to special vendor-supplied versions of stdlib, and so on. But I think it would be a mistake to avoid addressing this issue, since the consequence is that the library will not meet the expectations and requirements of programmers, and they might simply avoid using it.

Adding fypp to the project makes a lot of sense

Wow, brilliant reflection JC @loiseaujc, thanks for sharing!! Another very good tool that deserves to be modernized and added to the portfolio either under stdlib or under the fortran-lang namespace would be the fishpack library; see the original version here, and a modernization effort here.

Fair enough.

I feel like this is (partially) a chicken/egg problem. I do understand that stdlib is not as well established as other libraries. Having it not shipped (or promoted in some ways) by vendors however does not help it be more famous, and hence does not increase any incentive they would have to include it in their distribution. On the other hand, if they ship it, its usage is far more likely to increase, and hence the incentive to distribute it as well.

I guess this is partly related to the comments by @RonShepard regarding performance vs. robustness as well. If they do not want to ship the community version of stdlib, but still want to promote Fortran modernization (otherwise what’s the point of continuing the development of the compiler), they could possibly re-use some interfaces provided by stdlib but replace what’s under the hood with their own implementations. I suppose it may be relatively easy, for instance, for Intel to provide stdlib_linalg with their own mkl backend behind the scenes. I may be quite naïve though, and there are probably some more profound or technical reasons why they haven’t (yet, hopefully) said they’ll do something along these lines. Let’s say that having gfortran or ifx come with a version of stdlib (optimized or not for the moment, I don’t really care) would help the overall Fortran ecosystem a lot and will definitely be on my Xmas list.

I’m most familiar with the linear algebra module, but my guess is that this is where most of the meat is. As I said a few lines ago, Intel already has their mkl, which can easily be used as a backend. I don’t quite know about the other modules though.

Thanks Pedro. fishpack has definitely been on my radar for quite some time. I don’t play around with Poiseuille or channel flow as much as I used to, so I don’t need fast Poisson solvers quite as much as before, but it would definitely be beneficial to have it modernized. Given that solving separable elliptic PDEs is kind of a niche in the grand scheme of things, I’d say it would make more sense to have it under the fortran-lang umbrella than directly incorporated into stdlib. I don’t know.

Totally agree! Now, the Julia community has proven that it is possible to make up (partially) for the time dimension by the space dimension: a large community using/reporting bugs/contributing can also serve this purpose. It seems to me that for any foreseeable future stdlib will remain a community-driven project, as such, it is up to the community to use it>report bugs>contribute, to battle-test it in different scenarios. I’m sure that support by compilers will emerge as a consequence of gained popular momentum. This would break the chicken/egg problem.

Perhaps making a “distribution” including one or more compilers, fpm, stdlib and a few others, all “precompiled” and ready-to-use, like winpython? Winpython is a Python distribution for Windows shipping a gazillion packages (see list here) plus the interpreter (plus the Spyder IDE, VSCode, and other tools) in a simple zip file.

Mingw-w64 is another example on Windows, it packages gcc plus a package manager, a console and some utilities. Similarly at equation.com. It is built “on top” of the compiler, which allows shipping additional libraries without touching the compiler code.

Maybe this isn’t the right place, but I thought I would share why I don’t use stdlib (although I would like to eventually).

The main reason is the documentation. I don’t like how Ford looks and I find it difficult to navigate. It seems to be just gigantic unsorted lists of procedures/modules. I don’t know if it’s a matter of how stdlib set up their documentation, but the font is gigantic, so the interfaces take up a lot of the screen and declarations overflow to the next line. It might be due to being incomplete, but half the procedures I look at don’t describe what they do (they just say what variables are input and what they might return), so I am unsure whether I want to use them or not. Finding documentation hard to understand is why I switched from fpm to cmake in my projects (I went Makefile → fpm → cmake).

My second reason is related to what I mainly use Fortran for. I mainly use it for small-medium linear algebra heavy codes so I heavily use blas/lapack. I like being in control of my workspaces and how I call lapack so I just call them directly (no need for the stdlib wrappers). What I would like to use stdlib for are the things lapack doesn’t have. A lot of these things end up being small tasks, like sorting a vector and getting the sorting indices back, or something like Numpy’s isclose. If I check stdlib and find it has something similar to it, I decide that I would rather just spend the time coding it myself so I don’t have to add a large dependency to my project. Maybe if I knew stdlib had all of them before I started coding I would use it from the start. Last week I needed something like Higham’s expm which I couldn’t find in stdlib so I coded (a basic version) by hand.
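For readers unfamiliar with the two small tasks mentioned above, here is a minimal sketch of what they compute (in Python for brevity; this is not stdlib code, and a Fortran version would typically return 1-based indices). The helper names `argsort` and `isclose` are chosen for this example; the tolerance rule and defaults follow the documented behaviour of NumPy's `isclose`.

```python
def argsort(values):
    """Return the (0-based) indices that would sort `values` in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


def isclose(a, b, rtol=1e-5, atol=1e-8):
    """NumPy-style closeness test: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


x = [3.0, 1.0, 2.0]
idx = argsort(x)
print(idx)                       # [1, 2, 0]
print([x[i] for i in idx])       # [1.0, 2.0, 3.0]
print(isclose(1.0, 1.0 + 1e-9))  # True
print(isclose(1.0, 1.1))         # False
```

Note that this closeness test is not symmetric in `a` and `b`, because the relative tolerance is scaled by `|b|` only.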

Quite the opposite actually! I think it is exactly the right place. It is always good to have different points of view.

This is a fair point if you only look at the Modules or Procedures tab. A more informative one though is Contributing and specs (a bit of a misnomer, admittedly). It has a list of module-specific documentation with examples, information about the different interfaces, etc. Still a pretty long list, but with much better formatting and far more practical, I believe.

Porting an expm routine to stdlib has been on my todo-list for quite some time. As I’ve said, my current understanding of stdlib is that it is pretty much at parity with numpy. The matrix exponential, on the other hand, is provided by scipy, and thus is still a work in progress. Feel free to open a pull request if you want. Sure enough, it may divert you a bit from your day-to-day job, but it’ll serve a greater purpose (fancy words, I admit).

That’s a fair point. Automatically generated FORD docs can be thought of more as a technical documentation of the code, which is different from documentation in the sense of user manuals (as the term “documentation” is often used). @loiseaujc already pointed to more of a manual-style documentation, but it will require some update for more end-user accessibility eventually (as pointed out in the TODO). What may ultimately help with that is the better integration of FORD and sphinx.

I find that dumping the descriptions in the specs to text (my favorite method is “lynx -dump $URL”) and putting them into a file is particularly handy, as utilities like grep and vim allow you to quickly locate keywords in the descriptions and move examples into files. It is particularly useful when working on an off-web platform.

The same issue looms with the package repository: we have not specified a preferred description format. Something like markdown (pick your favorite flavor, but a lot of projects are on github and have some github markdown files already) seems like a good format, as it is readable without infrastructure in a CLI environment but can be converted to HTML easily. I use a preprocessor that starts with a markdown file and extracts only the lines between “```fortran” and “```” (and likewise between “```c” and “```”). That works well, as the markdown file itself is the project source, displays nicely on github, and is maintainable via GNU/Linux tools (aspell, vim, …). I don’t expect everyone to adopt that as the norm, no matter how advantageous it is in a pure CLI environment, but if you follow a few simple rules in your markdown it can (optionally) be easily converted to man-pages, sans graphics and LaTeX-like formulas. A number of packages like git(1) do something similar.
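As a rough illustration of the "markdown file is the project source" idea, the following sketch pulls out only the lines inside fenced blocks of a given language. It is written in Python for brevity and is not the preprocessor mentioned above; the function name `extract_fenced` and the sample document are made up for this example, and the fence marker is built programmatically so the example can itself sit inside a fenced block.

```python
FENCE = "`" * 3  # the three-backtick markdown fence marker


def extract_fenced(markdown_text, language="fortran"):
    """Return the lines found inside fenced blocks tagged with `language`."""
    kept = []
    inside = False
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if not inside:
            # Only a fence tagged with the requested language opens a block.
            if stripped == FENCE + language:
                inside = True
        elif stripped == FENCE:
            inside = False
        else:
            kept.append(line)
    return "\n".join(kept)


sample = "\n".join([
    "# hello",
    "A tiny program:",
    FENCE + "fortran",
    "program hello",
    "  print *, 'hello'",
    "end program hello",
    FENCE,
    "Closing remarks.",
])
print(extract_fenced(sample))
```

Everything outside the tagged fences (the prose) is dropped, so the same file serves as both readable documentation and compilable source.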

But the repository needs to have conventions on the documentation of at least an abstract, so tools like fpm-search can provide descriptions and search for functionality in packages. Experimenting with stdlib documentation to define such a standard might produce a solution for the repository in general and improve stdlib as well.

Now that several regex interfaces and several text attribute modules are available for Fortran, perhaps a minimal markdown viewer written in Fortran that could be bundled with fpm is feasible. The “extra” field in the fpm.toml file would be a natural place to point to the documentation as a URL and/or file. The result could be that, using something like the dependency tools recently mentioned, I would be in a terminal window in an fpm package and enter something like

  fpm list dependencies

and see that the package uses “stdlib”. I could then enter

  fpm doc --dependency stdlib

and it would take me to the documentation, or pull it with wget/curl, or …

and I could look for a certain keyword in an fpm package in (or not) a repository with something like apt-get search. We can limit the search to the manifest file keywords or tell it to look through at least any markdown files listed in the manifest as well, at least until the repository gets too big (which would be a nice problem to have in some regards).

It is also possible to do this using a comment convention and placing at least plain text in the source itself, which provides self-contained documentation. Tools can then extract the comments and process them as required. This is how my own indices are generated automatically (https://urbanjost.github.io/general-purpose-fortran/docs/man3.html).

I could picture an extension to ford where it could distinguish between developer and user documentation and provide a user doc as well as a developer doc (it leans more towards the developer doc currently, in my opinion).

We could let the manifest file point to any URL or file, but suggest a simple markdown file with at least an abstract go into the doc or docs directory in the package that the fpm repository could leverage, and that an fpm-search tool could use as well.

Just some random thoughts on the topic, but the desired take-away is that we start discussing providing some standard minimal required documentation for packages and that stdlib seems like a good starting point; and that the manifest file containing a pointer to it seems like a minimal requirement.

I think all recent languages provide for some form of source-code-resident documentation and a related search tool and display command. It would be a nice feature for the proposed standardized preprocessor to have: providing for an identifiable free-format text block for help text, like lines between “```doc” and “```”. A lot of fpp programs allow for C comment blocks, so that is a small leap to consider.

Standard Fortran does not allow this, of course, but programmers have been doing the embedding part for some 40 years with the de facto standard preprocessor.

#if 0
This subroutine computes....
#endif

The extraction and formatting part is still ad hoc. One problem with these last steps is that changing the embedded documentation while leaving the code unchanged can trigger an unnecessary recompilation sequence.
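One possible shape for the extraction step is sketched below (in Python for brevity). It is only an illustration of the idea, not an existing tool: the function name `extract_if0_docs` is made up, and the sketch assumes that `#if 0` blocks are used solely for documentation and are never nested inside other conditionals.

```python
def extract_if0_docs(source_text):
    """Collect the text of every '#if 0' ... '#endif' block as one string each."""
    docs = []
    current = None  # None means we are outside a documentation block
    for line in source_text.splitlines():
        tokens = line.split()
        if current is None:
            if tokens == ["#if", "0"]:
                current = []
        elif tokens == ["#endif"]:
            docs.append("\n".join(current))
            current = None
        else:
            current.append(line)
    return docs


source = "\n".join([
    "#if 0",
    "This subroutine computes....",
    "#endif",
    "subroutine compute()",
    "end subroutine compute",
])
for block in extract_if0_docs(source):
    print(block)
```

Because the extractor only reads the source, it does not by itself address the recompilation problem mentioned above; that depends on how the build system tracks changes.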

Nearly all source files I maintain use the prep preprocessor which has the directive

 $BLOCK --file FILENAME
      :
 $ENDBLOCK

where the file is only written to if an environment variable is set and the directory so defined ends in /docs/ or /doc/ to reduce overwriting something that is not explicitly a document file.

The vast majority of the files do not need preprocessing otherwise. That has worked for my purposes very well.
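The write guard described above can be sketched roughly as follows (in Python for brevity). This is not prep's actual code: the function name `doc_output_path` and the environment variable name `DOC_DIR` are hypothetical placeholders, and only the rule itself (variable must be set, directory must end in /docs/ or /doc/) comes from the description.

```python
import os


def doc_output_path(filename, environ=None, var="DOC_DIR"):
    """Return the path to write `filename` to, or None if writing is not allowed.

    `var` is a hypothetical environment variable name used for illustration.
    """
    environ = os.environ if environ is None else environ
    directory = environ.get(var)
    if not directory:
        return None  # variable unset or empty: never write
    if not directory.endswith(("/docs/", "/doc/")):
        return None  # refuse directories that do not look like a doc tree
    return directory + filename


print(doc_output_path("sort.md", {"DOC_DIR": "/tmp/project/docs/"}))
print(doc_output_path("sort.md", {"DOC_DIR": "/tmp/project/src/"}))
print(doc_output_path("sort.md", {}))
```

Requiring both conditions means a normal build, with the variable unset, never touches the documentation files.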

We used to write HTML files and put the code in between, but HTML fell out of favor, and as more people knew how to write markdown than HTML, the files gradually changed from HTML to markdown, with a good number just being Fortran.

FYI: I’ve just submitted a PR (still a draft for the moment) implementing the expm function, following the original implementation by John Burkardt for now. Feel free to contribute if you want :slight_smile:
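For anyone curious what such a routine involves, here is a deliberately simple sketch of a matrix exponential (in Python for brevity, on plain nested lists). It is not the code in the PR, not Burkardt's implementation, and not Higham's Padé-based algorithm: it uses scaling and squaring with a truncated Taylor series, and the names `expm_sketch`, `matmul`, `identity` and the `terms` parameter are made up for this example.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """n-by-n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm_sketch(a, terms=20):
    """Approximate exp(A) by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    # Scaling: pick s so that the infinity norm of A / 2**s is at most 0.5.
    norm = max(sum(abs(x) for x in row) for row in a)
    s = 0
    while norm > 0.5:
        norm /= 2.0
        s += 1
    scaled = [[x / 2.0 ** s for x in row] for row in a]
    # Truncated Taylor series: I + B + B^2/2! + ... + B^terms/terms!
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    # Squaring: exp(A) = (exp(A / 2**s)) ** (2**s)
    for _ in range(s):
        result = matmul(result, result)
    return result


print(expm_sketch([[0.0, 1.0], [0.0, 0.0]]))  # approximately [[1, 1], [0, 1]]
```

A production routine would choose the approximation order and the scaling more carefully and would work on contiguous arrays through BLAS/LAPACK calls; this sketch only shows the overall structure.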