GSoC'22: Accelerating Fortran DO CONCURRENT in GCC

Hi everyone,

I’m submitting a proposal for GSoC 2022 to work on DO CONCURRENT GPU offloading support in GCC. The original project idea is listed here:

There are three main goals for this project.

  1. Add support for the Fortran 2018 locality specifiers (LOCAL, LOCAL_INIT, SHARED) and the DEFAULT clause in the GCC Fortran parser.
  2. Add support for the upcoming Fortran 202X REDUCTION clause in the GCC Fortran parser.
  3. Implement both CPU-based and GPU-based parallelization of DO CONCURRENT as controlled by the new -fdo-concurrent=… compiler flag. The possible options to this compiler flag are:
    • serial - no parallelization.
    • parallel - pthreads-based CPU parallelization similar to/based on the existing -ftree-parallelize-loops=n compiler flag.
    • openmp - CPU parallelization using the OpenMP programming model.
    • openmp-target - GPU offloading using the OpenMP programming model.
    • openacc - GPU offloading using the OpenACC programming model.
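To make goal 1 concrete, here is a minimal sketch of a DO CONCURRENT loop that uses the Fortran 2018 locality specifiers. The program and variable names (`locality_demo`, `tmp`, `scale`) are made up for illustration, and it only compiles with a compiler that already implements these specifiers, which is exactly what this project aims to add to GFortran.

```fortran
program locality_demo
  implicit none
  integer, parameter :: n = 8
  integer :: i
  real :: a(n), b(n), tmp, scale

  scale = 2.0
  a = [(real(i), i = 1, n)]

  ! Fortran 2018 locality specifiers (illustrative usage):
  !   default(none)     - every variable used in the loop body must be listed
  !   local(tmp)        - each iteration gets its own, initially undefined, tmp
  !   local_init(scale) - each iteration gets its own copy of scale,
  !                       initialized from the value outside the loop
  !   shared(a, b)      - all iterations refer to the same a and b
  do concurrent (i = 1:n) default(none) local(tmp) local_init(scale) shared(a, b)
    tmp = a(i) * scale
    b(i) = tmp + 1.0
  end do

  print *, b
end program locality_demo
```

Each iteration writes a distinct element of `b`, so the iterations are independent and the compiler is free to run them serially, on CPU threads, or on a GPU.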

Personally, I want to learn not only how to break compilers, but also how to fix them.

The email to the GCC mailing lists can be accessed here:

What do y’all think about this?


@wyphan, that’s absolutely brilliant! Hopefully many readers with compiler development experience can give you useful, actionable technical feedback on your specific project ideas and proposal. From my perspective, I can’t cheer you on enough for your initiative and effort toward this!


I’m pleased to announce that my GSoC '22 project proposal has been accepted!

I’ll periodically post my progress here and on my personal website.
Feel free to reply to this topic for further discussion. I’ll try my best to check this thread every so often :grin:


Congratulations @wyphan!

Will the support of GPUs be limited to any particular vendors? If of any help, I can volunteer to run tests on a NVIDIA RTX 2060. A recent technical blog post from NVIDIA, Developing Accelerated Code with Standard Language Parallelism, showed that do concurrent loops could reach exactly the same performance as the OpenACC and OpenMP versions with GPU offload.


Thanks @ivanpribec !

Will the support of GPUs be limited to any particular vendors?

I envision it would initially support NVIDIA and AMD. For the latter I intend to apply for access to AMD Accelerator Cloud:

If of any help, I can volunteer to run tests on a NVIDIA RTX 2060

I just purchased an NVIDIA Quadro RTX A2000 6GB off eBay, which arrived yesterday. But I would appreciate your help testing on the RTX 2060 too, since that is a different generation (Turing instead of Ampere). I’ll PM you for more details.

Lemme tell you a little secret: I actually consulted with Jeff Larkin (the author of that technical blog post) before I submitted my GSoC proposal! If he didn’t indicate a “green light” to me, this whole thing wouldn’t have happened… :joy:

The DO CONCURRENT implementation in nvfortran is stellar. It is even able to interoperate with OpenACC. For more details on this, I’ll refer to the arXiv preprint by Stulajter et al.:

If you’re a visual learner, the paper was also presented at NVIDIA GTC 2022; you can watch the talk here (requires login):


This recording was published three days ago on the OpenMP YouTube channel:


Congratulations, Wil :slight_smile: Nice work!

This is nice, but it seems nvfortran does not support coarrays yet, at least based on this discussion. If nvfortran or flang supports coarrays in the future, then together with the do concurrent support that already exists, it might be possible to write code that is (1) both multiprocessing and multithreading, (2) standard-conforming, and (3) portable (on CPU or GPU clusters), all at the same time!
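A rough sketch of what such a combination could look like: each image (process) fills and sums its own block of data with do concurrent, and the Fortran 2018 collective `co_sum` combines the per-image results. The names (`images_plus_dc`, `offset`, `total`) are made up for illustration, and it needs a compiler with multi-image support enabled.

```fortran
program images_plus_dc
  implicit none
  integer, parameter :: n = 1000
  integer :: i, offset
  real :: x(n), total

  ! Each image works on its own block of the global index range.
  offset = (this_image() - 1) * n

  ! Within an image, the iterations are independent and may be
  ! executed in parallel (CPU threads or GPU, depending on the compiler).
  do concurrent (i = 1:n)
    x(i) = real(offset + i)
  end do

  ! Combine the per-image partial sums across all images.
  total = sum(x)
  call co_sum(total)

  if (this_image() == 1) print *, 'images:', num_images(), ' total:', total
end program images_plus_dc
```

Here the images provide the multiprocessing layer and do concurrent provides the on-node parallelism, with nothing outside the Fortran standard.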

@ivanpribec Yes, I attended that session in person, and I suggested that the meeting get recorded… Glad to know the recording is now available on YouTube. :grin:

@Arjen Thanks! I have much to learn about how to fix compilers.

@han190 That is correct: none of the compilers based on classic Flang (NVIDIA nvfortran, AMD AOCC flang, Arm armflang) support coarrays. During the ECP BoF days session on LLVM, I asked whether there are plans to add coarray support, and it looks like it’s not a priority…

GSoC '22 Blog post 0: GCCprefab – a relatively easy way to build GFortran

This weekend marks the end of the community bonding period for GSoC '22, and here’s my progress so far with the project.

I met with Tobias Burnus, one of my mentors, over an MS Teams call on May 30, 2022. Together, we picked GCC PR 102003 as a good starter issue for delving into the Fortran parser in GCC. He also guided me through debugging the compiler using gdb.

In the meantime, I’ve implemented a simple build script system for GCC that I’ve christened GCCprefab. Before it existed, there were only three relatively painless ways to build GCC:

  1. Using Spack package manager: spack install gcc
  2. Using the install script for OpenCoarrays
  3. Using jwakely’s build script

The name pays homage to prefabricated buildings such as sheds and barns (even sections of houses) that are commonly sold in the US at hardware stores such as Home Depot or Lowe’s. It’s arguably an overengineered solution to my being too lazy to memorize all the different configure flags needed to build GCC from source.

Right now, GCCprefab has the following features:

  • One single script written in Bash
  • “Eats” a config file with a custom format inspired by Spack spec syntax and the Windows INI / TOML format for configuration files
  • Clones the main GCC Git repo, or a custom mirror of your choice
  • Upon execution, logs standard output for each phase of the build process into a timestamped log file, which is xz-compressed after each phase completes successfully
  • Licensed under the Apache 2.0 license

To try it out, head over to my GitHub repo. Please feel free to open an issue there if you find a bug (er, “undocumented feature”) or want to suggest new features. Pull requests are welcome too!