Guidelines for GitHub Fortran repos

Having skimmed many Fortran projects on GitHub, here are some suggested guidelines for them. Except for the suggestions to use free format and avoid non-standard code, I have not mentioned code style. More guidelines are welcome.

  1. Describe what the code does in one or two sentences in the About section and perhaps in more detail in the Readme.
  2. Use topic tags in the About section to make the project more discoverable and quickly indicate what it is about.
  3. Upload source files, data files, and build files such as makefiles, but not files generated by the compiler, unless the goal of the project is to provide such files. It’s better to provide unzipped source files so that others can quickly browse them, rather than uploading them as a .zip or some other archive.
  4. Use the .f90 suffix for free source form, not .f95, .f03 etc. Prefer free over fixed source form. There are free tools to convert from fixed to free source form. Don’t describe your project as “Fortran 90” just because it uses the .f90 suffix for source files.
  5. Consider using the Fortran Package Manager as a build tool, since many Fortran programmers are using it.
  6. Describe what compiler(s) the project can be built with, especially if it uses extensions. Mention external libraries used.
  7. If there are preprints or publications associated with a project, link to them.
  8. Do not post on GitHub proprietary code that you do not have the right to distribute. It exposes you to legal liability and contaminates the code base of people who incorporate your code into their project. If your code uses, for example, Numerical Recipes (NR), either post only the non-NR source files and mention in the Readme that your project relies on NR, or better, find an open-source replacement for the NR procedures used.
  9. Do not use non-standard syntax just for convenience. For example, there are some compilers that accept . instead of % to access a derived type component, but this is non-standard. Deviating from the standard makes your code usable by fewer people.
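
For guideline 3, a .gitignore file can keep compiler output out of the repository. A minimal sketch, assuming gfortran-style object and module files and an fpm-style build/ directory; the pattern list is illustrative, not exhaustive:

```shell
# Write a starter .gitignore (this overwrites any existing one).
# The patterns are assumptions about a typical Fortran project; adjust as needed.
cat > .gitignore <<'EOF'
# Compiler-generated files
*.o
*.mod
*.smod
*.exe
a.out
# fpm build directory
build/
EOF
```

Projects whose purpose is to ship generated files (the exception in guideline 3) would drop the relevant patterns.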
15 Likes

I would also recommend providing a CMake build file, since the majority of projects still use CMake.

Ivan Pribec’s fmetis repo is a perfect example of how to support both fpm and cmake build files: GitHub - ivan-pi/fmetis: A modern Fortran interface to the METIS graph partitioning library

While fpm is growing in popularity, CMake still remains the primary cross-platform build tool for us, since we also have to build other source files from C/C++ and link to them.

3 Likes

Include unit tests for libraries, at least as a confidence test to ensure proper operation, particularly since someone may be building the package in a programming environment very different from the developers’ (different compiler, different OS, …).
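
Short of a full unit-test framework, even a tiny end-to-end check helps here. A minimal sketch of such a confidence test in shell, assuming the program under test prints a known result to standard output; `run_check` is a hypothetical helper name, not part of any tool mentioned in this thread:

```shell
# run_check NAME EXPECTED CMD...: run CMD and compare its stdout to EXPECTED.
run_check() {
    name=$1
    expected=$2
    shift 2
    actual=$("$@" 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
        echo "PASS: $name"
    else
        echo "FAIL: $name (expected '$expected', got '$actual')"
        return 1
    fi
}
```

A call such as `run_check version "1.0" ./myprog --version` (program name and flag are placeholders) would then report whether a freshly built executable behaves as documented.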

If building with fpm on GitHub, add the topic tag “fortran-package-manager” on production-level repositories and make release versions.

3 Likes

If you are seeking users, rather than co-developers, adding github, fpm and cmake as dependencies is going to keep the project from finding them. I distribute a Fortran program to economists and policy analysts who would never master all those dependencies. They compile a single source file with a single two-argument command that I document for them.

It depends on whether your audience is other Fortran programmers, or the program is a tool for use by other professions. “You can download it free from the Internet” is not an effective reply to the typical user who is missing a dependency.

3 Likes

This is a wrong approach to dependency management.

We cannot share only single source files for complicated projects.

In my humble opinion, if someone wishes to be a developer, and does not learn development tools (git, github, fpm, cmake, make etc.), then we should not keep enabling their bad behavior.

3 Likes

Fortran is particularly well-suited for single-file releases. fpm(1) is reasonably complicated and (so far) has a single-file version available for bootstrapping it. fpm(1) is definitely not maintained that way; but does have a release like that. He is not advocating developing using a single file, just providing something that can be built with nothing but a Fortran compiler for users who do not want to be developers. But that leads to truly making a binary package or self-building package via commands such as apt-get, … .

Interestingly, you can use the fpm package like that. Distribute the single-file fpm source for your users to build, and your program as an fpm package, and it becomes very easy for someone to build the package, albeit they might need git and curl/wget if it uses external packages.

gfortran fpm.F90 -o fpm
cd mypackage
../fpm install

Also, I do not know of a common Unix-like system with a Fortran compiler that does not have make/gmake available, so that is generally something easy for the end user to build with.

So I agree maintaining something as single-source is problematic, but if it is possible, a single-source distribution is the easiest for a non-developer to work with on the other end. Of course a self-installing binary distribution is even simpler for the end user if the majority of your users use compatible platforms.

In the past some compilers could dramatically improve optimizations when building from a single file; particularly if inlining could improve performance. I suspect that is not as common with more modern compilers but it would be worth trying when optimizing for performance.

I use a plug-in with fpm that turns a project into a single-file source. The fpm package itself contains the script it uses to turn itself into a single file.

Essentially, you do a clean build with a compiler command that is a script that calls the real compiler as usual but also appends the sources to a single file as it proceeds. At the end you get a single-file version of your project if you do not have more than one executable in your app/ directory.
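
The wrapper idea can be sketched roughly as follows. This is not the actual plug-in script, only an illustration under stated assumptions: `REAL_FC`, `SINGLE_FILE`, and `wrap_fc` are hypothetical names, the build is assumed to be serial, and sources are assumed to end in .f90 or .F90.

```shell
# Hypothetical compiler wrapper: append every Fortran source named on the
# command line to one combined file, then run the real compiler unchanged.
REAL_FC="${REAL_FC:-gfortran}"                 # assumed real compiler
SINGLE_FILE="${SINGLE_FILE:-single_file.f90}"  # assumed output name; delete it
                                               # before each clean build

wrap_fc() {
    for arg in "$@"; do
        case "$arg" in
            *.f90|*.F90)
                # Object names such as foo.f90.o do not match these patterns.
                if [ -f "$arg" ]; then
                    cat "$arg" >> "$SINGLE_FILE"
                fi
                ;;
        esac
    done
    # Hand all arguments to the real compiler as usual.
    "$REAL_FC" "$@"
}
```

Saved as an executable script that ends with `wrap_fc "$@"`, this could be named as the compiler for a clean build (for example through fpm's `--compiler` option). Because modules are compiled before the code that uses them, the combined file should come out in a compilable order.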

The perspective of code users is unfortunately missed by many computational scientists. If a project has a significant number of non-developer users (and you can’t distribute a binary for whatever reason), then I think doing a usability test for the build process is valuable. If the build process is too complicated, confusing, or fragile, that should be fixed.

Even for computational scientists, using a “modern” build system is often not trivial. My first experience with CMake involved (after considerable time spent) figuring out that I actually needed a newer version of CMake than the latest Ubuntu LTS at the time provided. The developers of the software I was working with could have added cmake_minimum_required to CMakeLists.txt, but they didn’t. (I think they did add that later.) This problem wasn’t that big a deal for me as I simply compiled and installed the latest CMake manually, but this would have been a showstopper for someone with less experience (like many of their users).
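
Besides `cmake_minimum_required`, a project's build instructions could include a small pre-flight check so that users get a clear message instead of a cryptic failure. A minimal sketch, assuming `sort -V` is available (GNU coreutils and recent BSDs); `version_ge` is a hypothetical helper and the minimum version 3.16 is a placeholder:

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

MIN_CMAKE="3.16"  # placeholder; use whatever the project actually needs

if command -v cmake >/dev/null 2>&1; then
    # The first line of the output looks like "cmake version 3.22.1".
    have=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$have" "$MIN_CMAKE"; then
        echo "CMake $have is new enough (need >= $MIN_CMAKE)"
    else
        echo "CMake $have is too old (need >= $MIN_CMAKE)"
    fi
else
    echo "cmake not found"
fi
```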

2 Likes

I have a different view of CMake based on experiences like @btrettel’s, where the developers couldn’t be bothered to specify a minimum version of CMake for building their project and instead force you to use the version they had on their development system or tie you to a particular version. Also, CMake might be appropriate for very large projects with hundreds of source files, but for small projects with a handful of source files it’s like hunting rabbits with a howitzer. There are better build systems for small to medium projects (fpm is evolving into one of them). On Windows I think you are just better off copying the source files and setting up your own project in Visual Studio or something like it if there are just a few source files to deal with.

1 Like

One of the original goals of fpm that influenced its internal design was the ability to create build files for (other?) package builders. Another package, ironically of the same name (fpm) but unrelated, does something related to that.

That seems to have lost steam but the original idea was you could write your fpm package out for CMake, Make, … . Is that idea off the table now?
That is perhaps best asked in another thread, but mentioning tools that help build a GitHub Fortran repo might be useful. I started a repository called “easy” that was meant to do just that for fpm projects, with skeleton CI/CD scripts, examples, and so on, but I have not had time to keep it up and finish it.

It is gratifying to have several users come to the defense of single file distribution. Indeed, I was not proposing that software be developed as a single file, only that a single file be offered to users that wish that form. I also offer Windows, OSX and Linux binaries (all statically linked) but in my experience government and corporate users are not allowed to import random binaries. I understand that there may be OS compatibility issues with static linking, but so far that has never come up and it would be a complication to ask users to install additional libraries.

…

1 Like

I don’t see why distribution as a single file is important, but I do think it’s nice if instructions can be provided for compiling a code straight from the command line, for example

gfortran -o main.exe kind.f90 mod1.f90 mod2.f90 main.f90

1 Like

I sometimes see Fortran projects on GitHub with proprietary code, most often Numerical Recipes. So another guideline is

  8. Do not post on GitHub proprietary code that you do not have the right to distribute. It exposes you to legal liability and contaminates the code base of people who incorporate your code into their project. If your code uses, for example, Numerical Recipes (NR), either post only the non-NR source files and mention in the Readme that your project relies on NR, or better, find an open-source replacement for the NR procedures used.
1 Like

I have heard there are utilities that teachers use that identify plagiarized prose and give other indications of the odds that the prose is original. With some of the public AI software now available and everyone scraping the web, is there anything like that for GitHub? It might be nice, when the fpm repository goes up, if there were something that flagged whether code had proper licensing markings, auto-generated suggested keywords for the packages, and categorized the code according to one or more of the software categorization standards. That would have been a huge undertaking at one point, but it seems right up the alley of something like ChatGPT(?)

Looking at other repositories, they often look chaotic (and enviously full of packages!). Categorizing packages automatically when the supporting team has not given them a category code might prevent that; a hierarchical naming scheme where you could put your packages in a “subdirectory” might work too.

1 Like

Just a little tip for GitHub users. I just faced a strange situation where my repos were flagged as mainly JavaScript or Pascal repos.

Because some of my repos contain:

  • the documentation site (lots of .js and .html files)
  • or include files (I use .inc which is recognized as Pascal)

the ‘Languages’ statistics on GitHub are often wrong.
If you end up in the same situation, you can add a .gitattributes file to the root of your GitHub repository with the following:

*.inc linguist-language=Fortran
docs/** linguist-documentation

The first line maps .inc files to Fortran, and the second marks the docs folder as documentation so that it is excluded from the language statistics.

6 Likes

Thanks, that’s a very interesting trick.

I had that problem with gtk-fortran, where .inc files were considered to be C or C++ files rather than Pascal (probably because in my case there are lots of C prototypes in comments in those files). As I did not know that trick, I fixed the problem by using the .in extension instead of .inc.

  9. Do not use non-standard syntax just for convenience. For example, there are some compilers that accept . instead of % to access a derived type component, but this is non-standard. Deviating from the standard makes your code usable by fewer people.
1 Like