Here are three recent preprints from arXiv; the last two share several authors.
Algorithm xxxx: HiPPIS A High-Order Positivity-Preserving Mapping Software for Structured Meshes
by Timbwaoga A. J. Ouermi, Robert M. Kirby, and Martin Berzins
arXiv 13 Oct 2023
GitHub: HiPPIS
Abstract:
Polynomial interpolation is an important component of many
computational problems. In several of these computational problems,
failure to preserve positivity when using polynomials to approximate
or map data values between meshes can lead to negative unphysical
quantities. Currently, most polynomial-based methods for enforcing
positivity are based on splines and polynomial rescaling. The
spline-based approaches build interpolants that are positive over the
intervals in which they are defined and may require solving a
minimization problem and/or system of equations. The linear polynomial
rescaling methods allow for high-degree polynomials but enforce
positivity only at limited locations (e.g., quadrature nodes). This
work introduces open-source software (HiPPIS) for high-order
data-bounded interpolation (DBI) and positivity-preserving
interpolation (PPI) that addresses the limitations of both the spline
and polynomial rescaling methods. HiPPIS is suitable for approximating
and mapping physical quantities such as mass, density, and
concentration between meshes while preserving positivity. This work
provides Fortran and Matlab implementations of the DBI and PPI
methods, presents an analysis of the mapping error in the context of
PDEs, and uses several 1D and 2D numerical examples to demonstrate the
benefits and limitations of HiPPIS.
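To make the "linear polynomial rescaling" idea from the abstract concrete, here is a minimal sketch in Python. It is not the HiPPIS algorithm and not its API; it only illustrates the kind of rescaling the abstract contrasts HiPPIS with. The function name `rescale_to_positive` and its parameters are hypothetical, and the sketch assumes equally weighted nodes so that the cell mean is the plain average of the nodal values.

```python
def rescale_to_positive(values, mean, eps=0.0):
    """Linearly squeeze nodal values toward `mean` until min(values) >= eps.

    Illustrative only: positivity is enforced at the given nodes,
    not everywhere in the interval, which is the limitation the
    abstract points out for rescaling methods.
    """
    if mean < eps:
        raise ValueError("mean must be at least eps")
    lowest = min(values)
    if lowest >= eps:
        return list(values)  # already bounded below, nothing to do
    # theta in [0, 1): shrink deviations from the mean just enough
    # that the smallest value lands on eps.
    theta = (mean - eps) / (mean - lowest)
    return [theta * (v - mean) + mean for v in values]


# Nodal values of a polynomial that undershoots below zero at one node.
nodal = [0.5, -0.1, 0.3, 0.9]
cell_mean = sum(nodal) / len(nodal)
limited = rescale_to_positive(nodal, cell_mean)
```

Because the map is affine about the mean, the average of `limited` equals the average of `nodal`, so the rescaling is conservative at the nodes while removing the negative value.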
Stencil-HMLS: A multi-layered approach to the automatic optimisation
of stencil codes on FPGA
by Gabriel Rodriguez-Canal, Nick Brown, Maurice Jamieson, Emilien
Bauer, Anton Lydike, and Tobias Grosser
arXiv 3 Oct 2023
Abstract:
The challenges associated with effectively programming FPGAs have
been a major blocker in popularising reconfigurable architectures for
HPC workloads. However, new compiler technologies, such as MLIR, are
providing new capabilities which potentially deliver the ability to
extract domain specific information and drive automatic structuring of
codes for FPGAs.
In this paper we explore domain specific optimisations for
stencils, a fundamental access pattern in scientific computing, to
obtain high performance on FPGAs via automated code structuring. We
propose Stencil-HMLS, a multi-layered approach to automatic
optimisation of stencil codes and introduce the HLS dialect, which
brings FPGA programming into the MLIR ecosystem. Using the PSyclone
Fortran DSL, we demonstrate an improvement of 14-100× with respect to
the next best performant state-of-the-art tool. Furthermore, our
approach is 14 to 92 times more energy efficient than the next most
energy efficient approach.
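For readers unfamiliar with the term, a stencil updates each grid point from a fixed pattern of its neighbours. The following is a minimal Python sketch of a 5-point averaging stencil, written only to show the access pattern; it is not taken from the paper, and the function name `jacobi_sweep` is hypothetical.

```python
def jacobi_sweep(grid):
    """Apply one 5-point averaging stencil sweep to the interior of `grid`.

    `grid` is a list of equal-length rows of floats. Boundary values
    are copied unchanged; each interior point becomes the mean of its
    four nearest neighbours in the input grid.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy so reads use only old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j]
                + grid[i][j - 1] + grid[i][j + 1]
            )
    return new


# A 3x3 grid with a hot boundary and a cold interior point.
grid = [
    [1.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
updated = jacobi_sweep(grid)
```

Every interior update reads the same fixed neighbourhood and is independent of the others, which is the regularity that domain-specific compilers can exploit when restructuring such loops for a target architecture.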
Fortran performance optimisation and auto-parallelisation by leveraging MLIR-based domain specific abstractions in Flang
by Nick Brown, Maurice Jamieson, Anton Lydike, Emilien Bauer, and Tobias Grosser
Abstract:
MLIR has become popular since it was open sourced in 2019. A
sub-project of LLVM, the flexibility provided by MLIR to represent
Intermediate Representations (IR) as dialects at different abstraction
levels, to mix these, and to leverage transformations between dialects
provides opportunities for automated program optimisation and
parallelisation. In addition to general purpose compilers built upon
MLIR, domain specific abstractions have also been developed.
In this paper we explore complementing the Flang MLIR general purpose
compiler by combining with the domain specific Open Earth Compiler’s
MLIR stencil dialect. Developing transformations to discover and
extract stencils from Fortran, this specialisation delivers between a
2 and 10 times performance improvement for our benchmarks on a Cray
supercomputer compared to using Flang alone. Furthermore, by
leveraging existing MLIR transformations we develop an
auto-parallelisation approach targeting multi-threaded and distributed
memory parallelism, and optimised execution on GPUs, without any
modifications to the serial Fortran source code.