Omni Compiler for XMP

The Omni Compiler, primarily a Japanese effort, was recently updated on GitHub, so I decided to try it on Windows Subsystem for Linux (WSL).

What is Omni Compiler?

Omni Compiler is a compiler for code including XcalableMP, XcalableACC, and OpenACC directives. The base languages supported by Omni Compiler are C language (C99) and Fortran 2008 in XcalableMP, and C language (C99) in XcalableACC and OpenACC.

Omni Compiler is one of source-to-source compilers that translate from code including directives to code including runtime calls. In Omni Compiler, XcodeML is used to analyze code in an intermediate code format of XML expression.

My installation advice for the Omni Compiler on WSL, based on a few failed attempts, is:

If java is not installed, run sudo apt install default-jre.
If javac is not installed, run sudo apt install default-jdk.
To avoid an error about a missing parser.h file, run sudo apt-get install libxml2-dev libxml2-doc
(I found this advice at linux - Compiler can't find libxml/parser.h - Stack Overflow)

Then the installation instructions at GitHub worked for me.

I know little about parallel programming. Is the purpose of this compiler to run multiple instances of a program in parallel? I can see the use of that for a Monte Carlo program but not a deterministic one, since the instances should give the same result.

Running the xnegloglik program from another thread, with n = 1000000 and niter = 100, I get the following timings. Shown first are the timings with gfortran -O2. The last run gives results from 4 different instances, each using different random numbers.

$ time ./gf.out
real	0m4.005s
user	0m4.003s
sys	0m0.000s

$ time mpirun -np 1 ./a.out
real	0m4.323s
user	0m4.044s
sys	0m0.036s

$ time mpirun -np 4 ./a.out
real	0m5.002s
user	0m18.724s
sys	0m0.101s
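One way to read the last timing is to divide the total user time by the number of processes. This is only a rough sketch: the helper name cpu_per_process is made up for illustration, and it assumes the four processes shared the CPU time about equally.

```shell
# Average CPU seconds per process: total "user" time divided by process count.
# cpu_per_process is a hypothetical helper, not part of Omni or MPI.
cpu_per_process() {
    awk -v user="$1" -v np="$2" 'BEGIN { printf "%.3f\n", user / np }'
}

cpu_per_process 18.724 4   # user time and -np value from the last run above
```

This gives roughly 4.7 s per process, close to the 4.0 s of a single run, which is consistent with each of the 4 instances doing the full computation rather than a quarter of it.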

Given the specifications, I would say that this compiler can be used to build Fortran programs that use coarrays and MPI to obtain parallelism. Both paradigms result in a single program that is run as multiple processes, but these processes can communicate with each other, and because each has a distinguishable identity (image number for coarrays, rank in MPI), you can have each process do its own work based on that identity.

Omni Compiler is to translate C and Fortran programs with XcalableMP and OpenACC directives into parallel code suitable for compiling with a native compiler linked with Omni Compiler runtime library.

If I understand correctly, it is not a compiler in the usual sense of that term, but a program (a meta-compiler?) that transforms a “Fortran+directives” code into a “Fortran+library calls” code that can be compiled by any compiler. And you need the Omni Compiler library for running the executable. Am I right?

Springer has published a book about it, which is available as a PDF for free:

The directives of the XMP parallel programming language are influenced by High Performance Fortran, and coarray syntax is also supported. The book's description says XMP code is run on the Fugaku supercomputer, which according to the TOP500 list has been the fastest supercomputer in the world since June 2020.

From the TOP500 web page:

Supercomputer Fugaku, a system based on Fujitsu’s custom ARM A64FX processor remains the new No. 1. It is installed at the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan, the location of the former K-Computer. It was co-developed in close partnership by Riken and Fujitsu and uses Fujitsu’s Tofu D interconnect to transfer data between nodes.

I wonder which compilers they are using. My brother used to run Fortran on Fujitsu (super-)computers at ANL in Canberra or perhaps at CSIRO.

I have found that the Omni compiler greatly speeds up a coarray program from a tutorial by Intel. The codes are in For the program mcpi_sequential.f90, compiled with gfortran -O2, computing pi using 600000000 trials sequentially takes 8.2 s on my machine running WSL. For the analogous program using coarrays, running

$ xmpf90 -O2 mcpi_coarray_final.f90
$ mpirun -np 4 ./a.out

takes 3.2 s.

On Windows (not WSL), the sequential program compiled with Intel Fortran

ifort -O2 mcpi_sequential.f90

takes 13.2 s, and the coarray program compiled with

ifort -Qcoarray mcpi_coarray_final.f90

takes 3.8 s.

So Omni is the fastest here, although Intel coarrays are also fast. I wonder if this is generally true.
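To put the four timings on a common footing, here is a small sketch that computes the speedup of each coarray run over the sequential run from the same compiler family. The helper name speedup is hypothetical, and the inputs are just the wall-clock numbers quoted above.

```shell
# Speedup = sequential time / parallel time (both in seconds).
# speedup is a hypothetical helper written for this comparison only.
speedup() {
    awk -v seq="$1" -v par="$2" 'BEGIN { printf "%.2f\n", seq / par }'
}

speedup 8.2 3.2    # gfortran sequential vs. Omni coarray with 4 processes (WSL)
speedup 13.2 3.8   # ifort sequential vs. ifort coarray (Windows)
```

By this measure the Intel coarray build gains more over its own sequential baseline (about 3.5x versus about 2.6x), while Omni has the lower absolute time. The two pairs ran in different environments, so the comparison is only indicative.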

In my first message I described what was needed to install the Omni compiler, discovered by trial and error. The manual lists these dependencies:

  • Yacc
  • Lex
  • C Compiler (supports C90)
  • Fortran Compiler (supports Fortran 90)
  • C++ Compiler
  • Java Compiler
  • MPI Implementation (supports MPI-2 or over)
  • libxml2
  • make
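A quick way to see which of these are missing before starting the build is to look for the corresponding commands on PATH. This is a sketch with assumptions: the command names gfortran, mpicc and xml2-config are my guesses for the Fortran compiler, the MPI implementation and the libxml2 development package, and missing_tools is a made-up helper. Substitute whatever your system uses.

```shell
#!/bin/sh
# Print each named command that is NOT found on PATH, one per line.
# missing_tools is a hypothetical helper, not part of the Omni Compiler.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed command names for the manual's dependency list; adjust as needed.
tools="yacc lex cc gfortran c++ javac mpicc xml2-config make"

missing=$(missing_tools $tools)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All listed tools were found."
fi
```

Finding a command on PATH does not guarantee that the version is recent enough, so this only catches the packages that are missing outright.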