Reviewers and maintainers needed for stdlib

We are looking for code reviewers at stdlib, if you can help out, jump over to the open PRs, select one you find interesting and submit a review.

Stdlib is one of our central projects in Fortran-lang. So far development at stdlib has been rather slow-paced since we have problems finding enough reviewers for the open PRs, with some being open since three months now without getting much feedback. Even for small patches review times are currently measured in weeks, which can make the overall process of contributing rather frustrating.

I briefly discussed the issue with @certik and @milancurcic, one promising suggestion was to setup maintainer teams for all our central projects, which would both empower maintainers, and make them accountable for their respective projects.

Still, beside having dedicated maintainers for stdlib, projects live from their community and the discussion in the PRs at stdlib has been very fruitful. I can only encourage to take up one of the open PRs and help reviewing it, it is very rewarding.

5 Likes

I’d like to help out with the statistics part, but I’m not sure what is needed. Would there be any value to comparing the output of the algorithms for probability distributions implemented in stdlib with the same distribution implemented in R?

2 Likes

Thanks, much appreciated. I usually start by looking over the specification for reviewing the proposed API. The examples and tests are also a good starting point. Testing against reference or external implementations sounds like a good strategy as well if you have access to those, it also gives a first hand experience with proposed API to find out any rough edges.

Feel free to ping me on any of the issues or PRs at GitHub if you have questions or need help.

1 Like

I wonder if one reason for the lack of reviewers is that, technically, it’s not entirely straightforward interact with the pull requests?

This was my experience, having occasionally reviewed for stdlib pull requests. Despite having used git to manage my own codes for many years, the review process requires some additional steps that took a little time to figure out (and there are still parts that I don’t understand).

So here’s my step-by-step guide to being a reviewer. It’s incomplete and might be outright wrong in places. But it would be great if we can refine it in this thread, and then make the information clearly available somewhere.

How to review a pull request in stdlib – INCOMPLETE DRAFT

Make sure you have an account on github - if not, go to www.github.com and sign up.

On this page you can browse pull requests that you might be able to review.

There are a couple of ways to review and comment on pull requests:

  1. The easiest is to use the github page for the pull-request to browse the code. You can use the Commits and Files changed tabs to look at the changes. Based on this you can make comments in the Conversation tab. That’s helpful to sort out high-level issues, but won’t show up as a formal “review”.

  2. Typically it is useful to work with the code itself, at least for non-trivial pull requests. See below for how to copy the code to a branch of your local stdlib repository.

  3. I have seen other people suggest changes to pull requests that can be merged via git. I don’t know how to do this, but instructions would be useful, see placeholder below

  4. Finally you can make a review via the github web page. To do this go to the webpage of the pull request, and click on the Files changed tab. If you are logged into github, note in the source-code display you can click on the code and make comments. To do the review you should make a bunch of comments on the code, and then use the Review changes button at the top-right to submit a review along with comments. Then you’re done.

In doing a review it is a good idea to compile the code, run it, and check a few cases. However IMO as a reviewer you shouldn’t feel obliged to write an extensive test suite. Rather, you should confirm that the test-suite provided in the pull request is sufficiently convincing. If it isn’t, by all means suggest new tests. If you have done them and can provide actual code, you can present and discuss it via the Conversation tab.

How to copy the pull-request code into a branch of your local stdlib repository

To do this you first need a local copy of the stdlib git repository. If you don’t already have it, running git clone https://github.com/fortran-lang/stdlib.git in a terminal will copy it into your working directory. If you cd into the stdlib directory and type git-branch, you will see you have the master branch.

To look at the code for the pull request, you can make another branch that contains it. To do this, notice there is a number associated with each pull request, e.g. from the website we see this one https://github.com/fortran-lang/stdlib/pull/272 has number 272. To download this code into a new branch of my local copy of stdlib, I do git fetch origin pull/272/head:BRANCHNAME where BRANCHNAME should be replaced with the name of the new branch I want to make (e.g. I might use pullreq_272 in place of BRANCHNAME). Best if the chosen name does not match any of your existing branches.

After running the git fetch ..... command above, nothing appears to change. However git branch will show that you now have the new branch in addition to the master branch. The command git checkout BRANCHNAME will then take you to that branch (again, replace BRANCHNAME with whatever you named it) . At this point you can look at the source code, compile it, etc. At the end of that, you can get back to the master branch with git checkout master.

How to make changes to the pull request code, and then request that the original author merge them into their repository.

I don’t know.

How to merge changes to your pull request suggested by reviewers with the above.

I don’t know. We had trouble with this in the sorting review.

8 Likes

I’d be happy to review some contributions. I’ll have a look at the open pull requests. Thanks to Gareth for the workflow overview as well!

1 Like

That is actually quite true, the GitHub web interface takes some getting used to. Having used it for a while now with stdlib and other projects as well, I found that it is not scaling well once we hit about hundred comments in the discussion, which is not that unusual for reviews at stdlib.

There are two ways I’m usually using.

Making comments with suggestions

You can open the Files Changed in the pull request and create a new comment for a single line or a range of lines. Comments for multiple lines can be created by dragging line from the blue plus sign selector.

Suggestions can be added by clicking the left-most button or by adding in the comment:

```suggestion
    ...
```

The suggestion can be committed by the pull request author in the Files Changed tab or directly from the Conversation tab. To synchronize local branches the PR has to be fetched from the fork again.

For more details see the GitHub docs.

Creating a patch

For larger suggestions you can create a local commit and push it to your fork of stdlib:

❯ git remote -v show
upstream	https://github.com/fortran-lang/stdlib (fetch)
upstream	https://github.com/fortran-lang/stdlib (push)
origin	git@github.com:awvwgk/fortran-stdlib.git (fetch)
origin	git@github.com:awvwgk/fortran-stdlib.git (push)
❯ git push --set-upstream origin BRANCHNAME

Go to your fork of stdlib and open a new pull request. To suggest the changes to an existing pull request, change the base repository from fortran-lang/stdlib to the repository of the original author and select the base branch in their repository which is used in the PR. This will create a PR inside the original PR.

Make sure to mention the original author explicitly in such a PR since must users are not actively monitoring their forks on GitHub but only the upstream repository. The original author can review your changes and merge them in their branch. Afterwards they will appear in the original PR.

3 Likes

I’ll admit I am guilty of not reviewing my fair share of PRs, although we don’t seem to be having quite as big a problem with the fpm repo. I could do a better job monitoring things, and we could ask others to do so as well, but I don’t think that is a sustainable long term solution.

I applaud @gareth and @awvwgk for putting together some good instructions for reviewers. This is certainly a good thing to help empower new comers to get started reviewing PRs. But I don’t think this is enough.

We should also empower and encourage contributors submitting PRs to request reviews from specific people. Would it be feasible to put together a list of potential reviewers and their “areas of expertise” so that contributors know who they can/should request reviews from?

In my experience, open-ended requests to the community are far less likely to receive meaningful responses than targeted requests. Even if the person you request a review from doesn’t have the time, expertise or confidence to perform the review, they may at least feel obligated to suggest an alternate.

3 Likes

Yes. And we note this in the Workflow:

When opening a PR, request reviews from one or more people that are most relevant to it. These are likely to be people involved in prior steps of the workflow. Other contributors (not explicitly invited) are encouraged to provide reviews and suggestions as well.

So if we follow the workflow and first discuss a feature in an issue, this helps sort out contributors who are both interested and have sufficient expertise to help.

Granted, we haven’t been following the proposed workflow. I think that suggests that it’s not well designed. There are advantages to discussing via PR–the biggest one being is that it’s easiest to demonstrate a new feature with its implementation, so people can see it in the “flesh” and try it out.

2 Likes

Thanks for starting this Sebastian @awvwgk. I think having dedicated maintainers is a good idea and I’d like to repeat the open invitation to everyone here to get involved where possible. Even a few additional reviewers for stdlib and fpm would go a long way to helping the projects move along.

Brad’s suggestion for targeted review requests is a good idea and somewhat similar to having ‘code owners’.

Unfortunately with stdlib, the nature of the contributions at this early stage are that they are typically very large and this has been the main limiting factor for myself in finding time to properly review the PR in its entirety. I will endeavour to review more stdlib PRs where my expertise allows me and I’ve set up a new email rule to keep Github ‘review requests’ in my inbox and to flag them (otherwise they get lost in all my other Github notifications).

1 Like

All we need is to find a person or a team who is interested in being in charge of reviews for stdlib. Then that person either reviews the PR, or finds a good reviewer and ensures that PRs are reviewed on time.

So that is what we should focus on, find a person or a team.

The exact rules or procedures then become secondary, as the person(s) in charge will iterate on what process works the best for them (and us as a community).

2 Likes

Thank you @awvwgk to start this thread.
I try to review as much as I can, but time is currently a limiting factor for me. Even if people would request me explicitely to review a PR (as proposed in the Workflow), I may not find the time to do it properly. And, as @lkedward mentioned it already, the size of the PR is also often an issue.

Having a person in charge might solve the issue only partly. E.g., in my case, a person in charge asking me to review a PR will not help me to find some time to allocate to this task :frowning: .

2 Likes

I’m going to need to do a bit of reading, but I was planning on comparing the Fortran standard library implementations to:

  1. R
  2. Octave
  3. GNU Scientific Library.

I don’t see why there couldn’t be a large, shared test case database for different projects to verify the implementation correctness of the algorithms. Does something like this already exist?

2 Likes

Thanks for the feedback everybody and also thanks to all the code reviewers at stdlib. I added this to the agenda of the next monthly call to discuss it in more detail, especially with a focus on a long-term solution.

To address the documentation of the review workflow, I opened an issue at the stdlib repository:

I think a wiki page might be the best solution to make this a community resource that can be easily improved.

5 Likes

I have a suggestion, I hope stdlib will support fpm on the next agenda. Based on the future stdlib need be compatible with fpm, and I learned through observation, fpm supports stdlib in roughly two ways:

  1. fpm to support fypp (see stdlib compatible with fpm ¡ Issue #279 ¡ fortran-lang/stdlib (github.com))
  2. fpm to support pkg-config (see Fortran packages using pkg-config ¡ Issue #445 ¡ fortran-lang/fpm (github.com))

For users and developers, the use of make and cmake will bring about some efficiency problems (see [FPM] add fpm support by zoziha ¡ Pull Request #437 ¡ fortran-lang/stdlib ¡ GitHub and LKedward/stdlib-fpm: Auto-generated mirror of the fortran-lang standard library structured as an fpm package (github.com)), the so many configuration files of make/cmake will make developers dizzy.
fpm is now very easy to use :star_struck:, I very much hope that stdlib will support fpm on the next agenda, which can liberate developers and users productivity, which will also bring greater visibility and more maintainers and reviewers to stdlib.
As far as I know, many of my friends use windows + visual studio + intel visual fortran as a development tool. At this stage, it is difficult for them to notice and use fpm and stdlib. We can gradually lower the threshold for participation in stdlib.

5 Likes

I think it would simplify the development of stdlib if one could simply compile it directly via fpm and develop it using fpm.

The native solution seems to be for fpm to support fypp. To avoid any Python dependency, the fypp pre-processor should be developed in Fortran (as an fpm package) and fpm should just depend on it.

So how difficult is to develop the fypp pre-processor in Fortran, as a standalone fpm package? The Python version is a single file: fypp/fypp at master ¡ aradi/fypp ¡ GitHub, which is 3000 lines, but we might not need all the features (at first) in order to be able to compile stdlib.

However, why don’t we first simply require fypp to be in path, and execute it from fpm? Yes, it would be fragile for people, but at least it would get us started. Then in the second phase we implement fypp directly in Fortran, so that fpm is standalone.

2 Likes

Note that fypp also supports defining Python expressions, meaning we would also need a Python interpreter, or have to limit ourselves to a subset of the fypp preprocessor language. But you are right, we can probably implement only what we need first.

I agree that integration with fpm is the next significant step in tidying up and lowering the barrier of entry for this developing Fortran ecosystem.

FWIW, I don’t think that adding the temporary Python dependency to fpm is that big of a deal for the user. Python is becoming ubiquitous and a simple pip3 install fypp command is an absolute breeze to get stdlib up and running with fpm.

I do also think that translating fypp to a native fortran fpm package would be awesome. How do you envision that process? Would we replace the Python for loop directives with do? That would require complete rewrites of .fypp source code, which could be a huge headache. Would we keep the original python directives in source code, but handle those under the hood with fortran? I would be interested in contributing to a native fortran version but I wouldn’t even know where to begin.

Unfortunately that simple pip3 install fypp command does not “just work” for me (it never did, neither on Linux nor macOS):

$ pip3 install fypp
Collecting fypp
  Downloading https://files.pythonhosted.org/packages/30/ac/a44c1421206d658765393911524ded650e8ae42190b426b27b5b76ef6e1f/fypp-3.1.tar.gz (68kB)
     |████████████████████████████████| 71kB 1.9MB/s 
Building wheels for collected packages: fypp
  Building wheel for fypp (setup.py) ... done
  Created wheel for fypp: filename=fypp-3.1-cp38-none-any.whl size=28565 sha256=bdb8e93be66914ee225da3fc948efda8a6c377cc52556730dd4f518d670c6010
  Stored in directory: /Users/certik/Library/Caches/pip/wheels/b1/b0/e3/2dbd7b83bfce8c37f23f9bc260ca8ff7006d1078758c733f75
Successfully built fypp
Installing collected packages: fypp
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/Library/Python/3.8'
Consider using the `--user` option or check the permissions.

WARNING: You are using pip version 19.2.3, however version 21.1.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

This is a common problem with Python. With the --user it would probably work, but then you need to fiddle with your PYTHONPATH etc. I just use Conda these days, that works (and I install fpm via Conda also). But then you have to install Conda first.

So the fpm Conda package can just depend on fypp and it should work on all platforms. That might be the best way forward in the short term.


In the long term, let’s brainstorm what Python features we actually use in fypp. Do we even use any Python expressions?

Dang, sorry to hear. I made the foolish assumption that if it works for me, it works for anyone else.

Off the top of my head I know that under common.fypp we use Python expressions to store the main generator loop variables like REAL_KINDS_TYPES, CMPLX_KINDS_TYPES, etc. Additionally, we define the python function rname to generate procedure names. And of course, when using for loops with REAL_KINDS_TYPES we use the python syntax:
for k1, t1 in REAL_KINDS_TYPES
to iterate through the list of tuples, assigning k1 to the first item in that iteration’s tuple and t1 to the second item in that iteratation’s tuple. Even that simple construct which is used all over (my) .fypp source code doesn’t seem like a trivial translation.

1 Like

Based on fpm is one of the cornerstones of the Fortran ecology, if it can gradually support meta-programming technology from expansion, it will become a very cool build system, but this will increase the burden on fpm.

At this stage, I have implemented a mode that supports fpm and fypp in forlab:

  1. Write *.fypp source codes in the mate-src folder;
  2. Use the make tool, call fypp to process the source code in the mate-src folder, and generate *.f90 to the src folder;
  3. Use fpm to link the *.f90 source codes in the src folder;
  4. Use fpm to test the test codes in the tests folder;
  5. Submit the source code both in the mate-src and src folder to the git repository.

The advantage is:

  1. Developers use make(+fypp)+fpm to simplify the development process;
    Just make&&fpm test
  2. The user uses fpm to compile the *.f90 files from the src folder;
    Just fpm build
  3. You can add any push submission without worrying about writing mistakes or missing the contents of the dizzy makefile/CMakelists.txt.

The weakness is:

  1. Destroy the existing make/cmake development mode;
  2. There are both meta-programming codes in the mate-src folder and *.f90 codes in the src folder, which increases the size of the stdlib repo.
1 Like