An FPM registry is under development. I think the Fortran community is more decentralized than that of R or Rust, and there are lots of projects that cannot or will not use FPM, as evidenced by my large list of Fortran codes on GitHub. Make and CMake are also commonly used build tools.
What features should a Fortran registry not restricted to FPM projects have? I would like to branch from my current large but static list to create one. (Access to Codex and Claude Code gives me delusions of competence.) Here are some thoughts.
List the build tool for each project.
Try to compile each project with gfortran, ifx, and other compilers and list which compilers it works with. List the standard(s) it complies with by trying to compile with various strict standard options.
Create a way to download entire categories of projects from the list, for example all the projects for numerical integration.
Index the source code of all projects so that queries about which projects implement which numerical methods can be answered instantly. For example, tell the user which projects implement a BFGS optimizer or a Mersenne twister RNG. This task is more ambitious.
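As a rough illustration of the indexing idea, here is a minimal Python sketch of an inverted index from method names to projects. It is a plain keyword match, not real code understanding, and everything in it is an assumption for illustration: the `METHOD_PATTERNS` table, the `build_index` function, and the project names in the usage note are all hypothetical. A real index would need a much larger curated vocabulary and would read files from the cloned repositories.

```python
import re
from collections import defaultdict

# Hypothetical search vocabulary: method name -> pattern that hints at it.
# Case-insensitive because Fortran source is case-insensitive.
METHOD_PATTERNS = {
    "BFGS optimizer": re.compile(r"bfgs", re.IGNORECASE),
    "Mersenne Twister RNG": re.compile(r"mersenne|mt19937", re.IGNORECASE),
}


def build_index(sources):
    """Map each method name to the sorted list of projects that mention it.

    `sources` maps a project name to a list of source-file contents (strings).
    """
    index = defaultdict(set)
    for project, files in sources.items():
        for text in files:
            for method, pattern in METHOD_PATTERNS.items():
                if pattern.search(text):
                    index[method].add(project)
    return {method: sorted(projects) for method, projects in index.items()}
```

Building the index once and storing the result is what would make later queries fast: answering "which projects implement BFGS?" becomes a dictionary lookup, e.g. `build_index(sources).get("BFGS optimizer", [])`. A keyword hit only shows that the term appears somewhere in the source (possibly in a comment), so matches would still need checking.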
After writing the above on my own, I asked ChatGPT what it thought. It suggested recording for each project:
last successful build date
compiler matrix: gfortran, ifx, ifort, nvfortran, flang-new, lfortran when relevant
OS matrix: Linux first, then macOS, then Windows if feasible
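A minimal Python sketch of how such a per-project record might be stored, together with the "try strict standard options" check mentioned earlier in the thread. The `BuildRecord` class, its field names, and the helper functions are hypothetical. Only gfortran is covered, using its `-std=` and `-fsyntax-only` options; other compilers use different flags and are not modelled here. The check handles a single self-contained source file, so projects whose modules depend on each other would need a real build.

```python
import shutil
import subprocess
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional

# Values accepted by gfortran's -std= option that are tried in turn.
GFORTRAN_STANDARDS = ["f95", "f2003", "f2008", "f2018"]


@dataclass
class BuildRecord:
    """Hypothetical registry entry holding the fields suggested above."""
    project: str
    build_tool: str                                   # e.g. "fpm", "cmake", "make"
    last_successful_build: Optional[date] = None
    compilers: Dict[str, bool] = field(default_factory=dict)          # compiler -> built OK?
    operating_systems: Dict[str, bool] = field(default_factory=dict)  # OS -> built OK?


def gfortran_syntax_check_cmd(source: str, std: str) -> List[str]:
    """Command line asking gfortran to check one file against one standard."""
    return ["gfortran", f"-std={std}", "-fsyntax-only", source]


def standards_accepted(source: str) -> Optional[List[str]]:
    """Return the standards gfortran accepts for `source`, or None if gfortran is missing."""
    if shutil.which("gfortran") is None:
        return None
    accepted = []
    for std in GFORTRAN_STANDARDS:
        result = subprocess.run(gfortran_syntax_check_cmd(source, std), capture_output=True)
        if result.returncode == 0:
            accepted.append(std)
    return accepted
```

Returning `None` when the compiler is absent keeps "not tested" distinct from "tested and failed", which matters if the matrix is filled in on machines with different compilers installed.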
Just to play devil’s advocate for a moment: are we not soon going to be in a position where it is quicker and simpler to specify to an LLM exactly what we need and let it generate code ab initio, rather than maintain databases of repos containing programs that may or may not do quite what we want? (Set aside the need for the repos to train the AI systems in the first place.)
Actually I think your work on collating repos and literature citing Fortran programs is brilliant.
The hope is that code from a registry would be of higher quality than what the LLM generates. Even if the code quality is the same, I think there is value in having one function calculating, say, the standard deviation in all your projects instead of slightly different LLM-generated ones in each project. If you think all code should be reviewed by a human, a registry should save time, since code is reviewed once (and by a community). I think stdlib is inspired by this belief.
Here is something along the lines of a wish list of what an Open Source repository should have, which I edited from a Changelog file:
Changelog
“Do unto others as you would have them do unto you”, as they say. When I
find Open Source resources, I am hoping a lot of these boxes can be
checked …
Base package
annotated source files
an open license and metadata (version number, category, …)
git repository on the WWW (e.g. GitHub, GitLab, …) or home page
Documentation
README and/or synopsis
user manual (online via the WWW and/or Adobe PDF)
man-page or flat-text or html documentation
developer documents (and/or ford(1) or doxygen(1) config files to generate such)
a brief pedigree citing applicable references, major uses, and degree of support
note of any vetting, review, or rating of the code that is available
a feedback and/or patching or “pull request” mechanism
That will not be so good for systems requiring stability or reproducibility. Some projects also simply do not want to deal with AI code, for various reasons. And for the things we do in Fortran, such as numerics, LLMs perform worse than on general programming tasks because of the lack of training data.
I also think only CMake and fpm matter, because make basically translates to “everyone has a custom recipe, maybe following the conventions of how things are installed”.
The one thing I would add to this list would be a permanent identifier or location, e.g. a DOI. If this comes via a journal such as JOSS, fine; if not, a release on e.g. Zenodo would also be good.
Is this intended to be only a registry, or also to store copies of the package releases? I would hope the latter. The most important aspects (IMO) will be the security and management features: who is allowed to upload new releases and for which projects, how releases can be removed or deprecated, how any malicious package uploads will be handled, etc. All the other features you list would be nice to have, but not nearly as crucial.