High Performance Fortran (HPF) history and lessons

I found this article and the associated Hacker News (HN) discussion about High Performance Fortran (HPF):

And I think there are many lessons there that we should learn from as we are moving Fortran forward.

Here are some things I learned from the article:

  • The importance of having a good modern Fortran compiler under our control that we can use to develop and experiment with new features. We will have LFortran and Flang that the community can contribute to.

  • The first comment at HN says: “But the productivity boost of a REPL is undeniable and I use Julia mostly now.” I 100% agree, and we have a fix: LFortran provides a REPL for Fortran.

What do you think are some things we should learn from HPF?

4 Likes

For convenience, this is the original article page (which we can reach by clicking the title on the Hacker News page above):

1 Like

Thanks for this link. I had run across a reference to the article before, but the link I found didn’t allow me to read the whole thing. I have been writing a blog post on FORALL vs. DO CONCURRENT and included the history of HPF behind FORALL.

As someone who worked on an HPF compiler (DEC Fortran 90), I agree with the authors’ belief that it was both ahead of its time and insufficiently expressive. The lessons learned with HPF influenced the development of DO CONCURRENT and the locality spec enhancements in F2018. Of course, the continued development and wide acceptance of OpenMP also played a significant role.

The lesson I took from the ultimate failure of HPF is that trying to “bake in” to a language features tied closely to current hardware architectures is likely to end up being a wasted effort that becomes irrelevant over a short timeframe. Over my career I have seen multiple “go faster” hardware trends come and go - I consider the current attention paid to GPUs to be in this class. A few years from now, GPUs will likely be forgotten. A few years ago, there was a proposal to add a “FASTMEM” attribute to Fortran, as there were some current hardware implementations of that. It was rejected, and the concept of a separate class of “fast memory” soon started fading from current hardware.

Experimenting with language features in a compiler can help reveal problems in the specification, but I’m skeptical that it sheds any light on how useful it is to application development.

6 Likes

Thanks @sblionel. I agree with the general idea that we do not want to bake into the language references to current hardware.

At the same time, we want to run on the current hardware, efficiently.

The approach that I am offering (I don’t know if that is the best way, but it’s one of the ways):

  • Get some Fortran compilers running efficiently on the current hardware (GPU) using some prototype changes, either language extensions or pragmas, or improvements to the compiler optimizations / backends, etc. Whatever it takes to match other languages (such as C++) in performance on GPUs from Fortran.

  • Have a discussion with the community and the committee about what we want to do.

  • Based on the previous point, make recommendations for users on how to run on current hardware efficiently.

I think we are all in agreement we should only standardize things that can last a long time, and will not go out of fashion in 10 or 20 years.

1 Like

Based on my HPF experience with customers (Cray used to support the PGI implementation on the T3E) the cumbersome data distribution model was the ultimate nail in the coffin. It was especially hard to understand what happened at call sites with arguments that were distributed arrays. The SPMD model of MPI (and Fortran since 2008) was a lot easier for users to understand. Ultimately, the problem was not a botched implementation (PGI implemented the spec faithfully), but that the model was too hard to understand for programmers who had not participated in creation of the model, and it assumed that the only thing you wanted to do in parallel was operations on distributed arrays. The SPMD model was both more flexible and easier to understand.

1 Like

Indeed, my reaction as well. Gfortran is THE open-source, community compiler, and it is viable as a production-level compiler for serious users. FLANG shows similar promise down the road.

1 Like

Do you have some evidence that this is needed? Many Fortran compilers already support GPU programming, through OpenMP, OpenACC and other methods. Which C++ features related to GPU programming are missing from the Fortran ecosystem?

@sblionel yes, the evidence is several codes around me moving to C++ precisely for this reason. The main feature missing is support for the GPUs of all the major vendors from the same code base. Each of the alternatives, such as OpenMP, OpenACC, HIP, CUDA, …, typically works well on a GPU of one vendor, but not on a GPU of another vendor. In your experience, what would you recommend for users to use in Fortran production codes? If you tell me what you recommend, I can bring the specific objections. In C++, the method of choice for the codes around me is the Kokkos library, which runs on all GPUs and multicore CPUs with the same C++ code.

@kargl indeed, I use gfortran as my main Fortran compiler. Unfortunately, as you noted, for multiple reasons it seems hard to create a community of contributors to gfortran, or to use it as a basis for an LLVM-based compiler, and that is the motivation for the new compiler efforts.

@kargl thanks for clarifying. I agree that the “community of users” is not the same as the “community of developers” and that it is hard to attract developers.

@certik, would you please identify specific language features that have been added to C++, related to GPU programming, that you feel are missing from Fortran? I want to understand what sort of features you’re talking about. That applications have moved from Fortran to C++ is a fact, but I don’t see that it’s due to a lack of GPU support in the standard.

OpenMP and OpenACC address vendor portability at the source level, with the limitation that some compilers support, as you say, only their own hardware. Intel’s oneAPI claims to support multiple hardware targets, so that may be a future option. I don’t see how adding a language feature would naturally result in multivendor support by a given compiler.

1 Like

@sblionel I don’t know yet if we need to add some language features to better support GPUs. You are correct that there are always multiple reasons why an application moves away from Fortran. But one of the main reasons that I see around me is GPU support. I don’t think that there is any one language feature in C++ that enables that, but it is a fact that C++ makes it possible to write libraries like Kokkos, which in turn make it possible to write multiplatform code. In Fortran, the natural way is to get good support for do concurrent (which is in the language itself, and which I think is the preferred method for Fortran) and then whatever else is needed. oneAPI looks interesting, but it’s relatively new.
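To make the library argument concrete, here is a minimal sketch of the general pattern: user code passes a lambda to a library-level parallel_for, and the library decides how to run it. This is not the Kokkos API. The names Serial, Threads, parallel_for and axpy are hypothetical, the only "backends" are a plain loop and std::thread, and a real portability library would add GPU backends behind the same interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical backend tags (illustration only, not a real library's types).
struct Serial {};
struct Threads { unsigned count = 4; };

// Serial backend: a plain loop over [0, n).
template <class Body>
void parallel_for(Serial, std::size_t n, Body body) {
    for (std::size_t i = 0; i < n; ++i) body(i);
}

// Threaded backend: split [0, n) into contiguous chunks, one per thread.
template <class Body>
void parallel_for(Threads backend, std::size_t n, Body body) {
    const unsigned nthreads = std::max(1u, backend.count);
    const std::size_t chunk = (n + nthreads - 1) / nthreads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = std::min(n, begin + chunk);
        if (begin >= end) break;  // no work left for further threads
        workers.emplace_back([=] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& w : workers) w.join();
}

// User code: written once, independent of the backend.
// Assumes x has at least as many elements as y.
template <class Backend>
std::vector<double> axpy(Backend backend, double a,
                         const std::vector<double>& x,
                         std::vector<double> y) {
    double* yp = y.data();
    const double* xp = x.data();
    // Each index writes a distinct element, so iterations are independent.
    parallel_for(backend, y.size(),
                 [=](std::size_t i) { yp[i] += a * xp[i]; });
    return y;
}
```

Calling axpy(Serial{}, 2.0, x, y) and axpy(Threads{}, 2.0, x, y) runs the same loop body on either backend. The design point is that the loop body is an ordinary value (a closure) that a library can hand to whatever execution mechanism it likes, which is the role do concurrent would play on the Fortran side.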

Speaking of Intel oneAPI, the Intel website says:

The programming language for oneAPI is Data Parallel C++ (DPC++) and employs modern features of the C++ language to enact its parallelism. In fact, when writing programs that employ the oneAPI programming model, the programmer routinely uses language features such as C++ lambdas, templates, parallel_for, and closures.

This would seem to imply that people at Intel believe C++ is a more versatile language for heterogeneous computing, given the amount of work they have put in to produce oneAPI.

I love how the oneAPI documentation also offers the following tip:

TIP: If you are unfamiliar with these C++11 and later language features, consult other C++ language references and gain a basic understanding before continuing.

I don’t think I will go down this path anytime soon. I will rather wait for a Fortran solution to come along.

2 Likes

Intel is working on a oneAPI Fortran. I think the problem @certik refers to is that vendors of accelerator hardware focus their support on C++ because it is a more popular language. It isn’t due to a defect in Fortran nor a language superiority for C++. This is why I am skeptical of claims that we have to do “something” to better support GPUs in Fortran. There has been a long-term shift away from Fortran, but “fads” in programming languages come and go. I think we need to focus on the strengths Fortran provides and to make it more productive to use Fortran. The work on generics is a large part of this. I’m doubtful that we need to spend much energy on further refining DO CONCURRENT, etc.

We have quite a few WG5 members with extensive experience in applications and compilers, who have guided our direction over the years. Vendors such as Cray, AMD, IBM, NVIDIA and Intel are well acquainted with what their customers are asking for.

4 Likes

@sblionel for me “fixing Fortran” doesn’t just mean the “Fortran standard”, but rather the whole ecosystem of tools, libraries and compilers. It might be that we do not need to modify the language itself too much, but we need to improve the tools and other things so that Fortran users are not forced to migrate away.

4 Likes

@certik then we need to identify what needs improving. It doesn’t help to just hand-wave about this. We need specifics.

I agree this is the way to go, as opposed to trying to define anything in the language or standard a priori. I also think this is quite specific to start.

Currently, some compilers support compiling Fortran code for some GPUs. We need an open source compiler that will compile for most GPUs.

I think there’s an implied question here: Why can’t we build on the effort of gfortran which is already a mature compiler and why do we need a new open source compiler?

As I understand it, one argument is for building with LLVM, which provides mature tooling for writing compilers.

Another is that, as many of us here agree, Fortran needs an interpreter/REPL to survive and grow in the long run. I don’t know this myself, but I hear from others that this is considerably less difficult to do with LLVM than with gcc.

What else? Why can’t we do this with gfortran?

Though I cannot explain it very well (because of my lack of CS knowledge), I strongly feel that the basic difference between Fortran and C++ (or other languages including Python) is that Fortran (or the committee) disregards the importance of irregular data structures, and only focuses on regularly structured data. Indeed, Fortran still has only an “array” as a data structure, and nothing else. As a result, we (the users) need to express everything in terms of that “array”, which makes the coding very verbose and tiring. Those “arrays” do not even have a push() or append() method in a type-agnostic manner yet (here, arr = [arr, elem] is not performant, I think). For applications with regular grid structures, this may be very advantageous by making the programming easier, but for other applications, Fortran does not become an option because of the lack of various data structures (which is my current “impression” or assumption). ← My opinion in this part is a bit different from that of Ondrej and other people, I guess… In other words, Fortran lacks “smart objects” in the following quote (but my interpretation of the quote may be wrong; sorry in that case…)

Rob Pike’s 5 Rules of Programming (The corresponding HackerNews here)

Rule 5 is often shortened to “write stupid code that uses smart objects”.

Another concern of mine: do the WG5 members also include experts in fields that focus more on irregular data structures (including machine learning)? Aren’t the members possibly dominated by CFD communities? I am afraid there might be some selection bias (on both the committee and user sides) in designing/choosing/requesting language features.
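On the arr = [arr, elem] remark above, a rough sketch of where the cost comes from. It counts element copies for two growth strategies: rebuilding the whole array on every append (the straightforward reading of arr = [arr, elem]; whether a particular Fortran compiler does better is not something this sketch claims) and the capacity-doubling scheme that growable containers typically use. It is written in C++ only for illustration, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Grow by rebuilding: every append allocates a new array one element
// larger and copies all existing elements into it.
std::size_t copies_rebuild(std::size_t n) {
    std::vector<int> arr;
    std::size_t copies = 0;
    for (std::size_t k = 0; k < n; ++k) {
        std::vector<int> bigger(arr.size() + 1);
        for (std::size_t i = 0; i < arr.size(); ++i) {
            bigger[i] = arr[i];
            ++copies;
        }
        bigger[arr.size()] = static_cast<int>(k);
        arr.swap(bigger);
    }
    return copies;
}

// Grow by doubling: keep spare capacity and copy existing elements
// only when the storage is full.
std::size_t copies_doubling(std::size_t n) {
    std::vector<int> storage(1);  // capacity, managed by hand for counting
    std::size_t size = 0;         // number of elements actually in use
    std::size_t copies = 0;
    for (std::size_t k = 0; k < n; ++k) {
        if (size == storage.size()) {
            std::vector<int> bigger(2 * storage.size());
            for (std::size_t i = 0; i < size; ++i) {
                bigger[i] = storage[i];
                ++copies;
            }
            storage.swap(bigger);
        }
        storage[size++] = static_cast<int>(k);
    }
    return copies;
}
```

For 1000 appends, the rebuild strategy performs 499,500 element copies (n(n-1)/2), while doubling performs 1,023. A type-agnostic append() in the language or a standard library could hide the second strategy behind a simple interface.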

4 Likes

@sblionel and any and all readers working on orgs such as WG5 or in the Community toward an official / unofficial charter to advance Fortran as a language, please pay attention to this note and read carefully.

First, Fortran language bearers need a vision to make Fortran the lingua franca of scientific and technical computing. Toward this, as mentioned by @certik and others here and elsewhere, not only processors but the entire set of tooling options and the ecosystem are critically important. Toward this, everyone should ask themselves: how inconvenient or difficult is it to develop such tooling using the Fortran language itself? For example, can one productively bootstrap an entire Fortran compiler and continue to enhance and maintain it using a base language set, e.g., Fortran 2018? If not (and the answer in year 2020 is indeed that one cannot), that has gotta be a realization that there are serious deficiencies and those gaps need to be addressed. The vision has to be a language such that everyone, at least in principle if not always in practice, can do all the needed tooling using the language itself.

Another crucial aspect of the language vision is to consider library development. Libraries are of utmost importance to all domains, but especially so in scientific and technical computing. It gets to the core of how science, especially the third leg now, which is computational science to complement theory and experiments, advances: brick-by-brick and upon the shoulders of others. How easy and convenient is it to develop libraries in Fortran that can be used effectively and productively by the broader scientific and technical community? In year 2020, the reality is that it is extremely difficult, impossible even in some situations. This extends to both standard libraries and other community / domain-specific libraries. The issues here are another indicator of where the language needs to improve.

With the above vision in mind, here’re the specifics for the base language itself to improve library and toolset development:

  1. Unsigned integers or bitset type,
  2. Proper enumeration type along the lines of Swift (preferably), or at least C++ / D,
  3. Improved support for OO paradigm including easier static casting and to be able to SEAL classes (mark them as final)
  4. Improved Generics and template programming including utilities that enable robust new containers and algorithms. This includes facilities such as iterators and being able to overload “operators” such as N:M for array sections.
  5. Anonymous functions (lambda operations) and closures,
  6. Exception handling

The above list is not too long; it can be achieved with some focus and attention, especially toward the vision.

If this is achieved, it will truly open up Fortran to a new world of libraries (a la Kokkos in C++ for GPUs) and it will enable Fortran to remain agnostic to hardware and other combined hardware-software trends.

7 Likes

I do not agree that one should be able to bootstrap a Fortran compiler in Fortran. Fortran never was intended to be the “one language for all things”. Fortran is good for some things (scientific programming) and less good for other things (low-level system programming). I am not in favor of diluting Fortran’s strengths in an attempt to draw in uses more appropriate to other languages.

I am in favor of making Fortran programmers more productive in their development of applications where Fortran’s strengths lie. I especially encourage ways to make Fortran libraries more useful, though there is a historical obstacle that such libraries must be built with a specific Fortran compiler in mind. (This is not the fault of the language.)

8 Likes