Parallel block-structured class abstraction for FDM – ideas

I’m currently developing an FDM/FVM code (using contravariant coordinates) in Fortran with coarrays. So far I have the sparse-matrix solvers (BiCGStab, working on AMG) and the data-structure (CCS) libraries developed and working (I think), but I haven’t yet come up with the right abstraction for the mesh class and, by extension, the fields and operators.

Said abstraction, I hope, could be a natural extension of a single-block structured mesh, which can be treated using plain 2D or 3D (or ND) arrays. Indexing is then the natural (i,j,k), both for the fields and for the mesh nodes/centroids, and parallelization is “trivial” in this context via halo/ghost cells.
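To make the single-block picture concrete, here is a minimal Python/NumPy sketch of the halo/ghost-cell exchange for a 1D row of 2D blocks (in the actual code this would be coarray gets/puts between images; the block shapes and the one-layer halo are assumptions for the example):

```python
import numpy as np

def exchange_halos(blocks, nh=1):
    """Fill each block's ghost layer from its neighbour's interior,
    for a 1D row of 2D blocks (a toy stand-in for a coarray halo swap)."""
    for left, right in zip(blocks, blocks[1:]):
        # last nh interior columns of 'left' -> left ghost layer of 'right'
        right[:, :nh] = left[:, -2*nh:-nh]
        # first nh interior columns of 'right' -> right ghost layer of 'left'
        left[:, -nh:] = right[:, nh:2*nh]

# two blocks with one ghost column on each side in i
a = np.zeros((4, 6)); a[:, 1:-1] = 1.0   # interior value 1
b = np.zeros((4, 6)); b[:, 1:-1] = 2.0   # interior value 2
exchange_halos([a, b])
# a's right ghost column now holds 2.0, b's left ghost column holds 1.0
```

With (i,j,k) arrays plus ghost layers, the interior loops never see the block boundary, which is what makes this layout so convenient.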

Clearly, in a multi-block structured mesh context, the natural (i,j,k) indexing cannot be used anymore unless you come up with a clever strategy for splitting the overall multi-block mesh into smaller “rectangles” distributed across images. I guess one could build a “super-mesh” that simply extends the computational gridlines out to the outermost boundaries, so as to keep exploiting (i,j,k) indexing. See below:

This idea, although KISS-like, seems quite “wasteful” to me, since you may end up with super-grids orders of magnitude bigger than the original mesh. Besides, it requires you to keep track of “live/fake” cells, making your DO loops inefficient, so I’m not convinced by this route.
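To quantify the “wasteful” point, here is a small Python sketch (the block origins and shapes are hypothetical) comparing the super-grid bounding box against the live cells of the original blocks:

```python
from math import prod

def supergrid_waste(blocks):
    """blocks: list of (origin, shape) tuples in a global (i, j) or (i, j, k)
    index space. Returns (cells in the bounding super-grid, live cells)."""
    ndim = len(blocks[0][0])
    lo = [min(o[d] for o, _ in blocks) for d in range(ndim)]
    hi = [max(o[d] + s[d] for o, s in blocks) for d in range(ndim)]
    super_cells = prod(h - l for h, l in zip(hi, lo))
    live_cells = sum(prod(s) for _, s in blocks)
    return super_cells, live_cells

# hypothetical L-shaped mesh: a 10x10 block plus a 10x5 block
super_cells, live = supergrid_waste([((0, 0), (10, 10)), ((10, 0), (10, 5))])
# 200 super-grid cells, only 150 of them live
```

Even this mild L-shape wastes 25% of the super-grid; thin blocks far apart in the global index space make the ratio arbitrarily worse.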

I guess the question is whether it is possible to build a class for block-structured meshes that still allows me to use (i,j,k) indexing. My reason for preferring (i,j,k) indexing over the more general face-ordering of data used in unstructured codes is that the former is more amenable to Padé-type schemes and translates directly to FDM. My other excuse is that I’m not familiar with graph partitioning (METIS, etc.) or the data structures used by general unstructured (or block-structured) codes, though I’m willing to learn if that’s the way to go. I might be wrong, and perhaps there is a more general data structure and graph-partitioning approach that would give me good parallelization.

Anyway, if you could point me in the right direction or towards the relevant literature I’d be very grateful :slight_smile:.

Santiago


Not sure what you are asking. Do you want a global (i,j,k) system comprising all of the blocks? Why bother with that, since each block has its own (i,j,k) system? Just store the data needed to handle the transfer of data to/from the layer(s) of ghost cells around each block. For parallelism, you need some form of load-balancing algorithm designed to deal with multi-block grid systems; those have been around for more than 30 years if you check the literature. You could have a global field class that has an array of structured-grid classes as a component. NASA’s CFL3D code is an example of what was at one time the state of the art in structured multi-block CFD codes (GitHub - nasa/CFL3D). Most general-use CFD codes have moved to unstructured grids, although there are still applications where structured-grid codes can be more accurate (and faster running). Given the speed of current HPC systems and the overall robustness of the underlying solver technology, grid generation is still the major bottleneck in terms of time to solution for a typical CFD analysis workflow.


Not sure what you are asking. Do you want a global (i,j,k) system comprising all of the blocks? Why bother with that, since each block has its own (i,j,k) system?

No, that’s not my question. I understand that’s not an effective extension of the classical single-block structured (i,j,k) addressing for CFD; it was just an exposition of my thought process so far. Bear in mind that I start from structured, curvilinear CFD solvers, where partition strategies are “trivial”.

For parallelism, you need some form of load balancing algorithm designed to deal with multi-block grid systems.

Exactly! This is what I need (I guess), with one additional condition: the partitioned sub-grids have to be structured. The problem with METIS is that it seems I cannot get a partitioned structured mesh out of it, though perhaps I’m wrong.

Once I have the partition figured out, managing the local 3D arrays is no biggie.

Those have been around for more than 30 years if you check the literature (…) NASA’s CFL3D code is an example of what was at one time the state of the art in structured multi-block CFD codes.

I don’t disagree that there’s literature out there on this topic; the problem is one of relevance, portability, and validity. I was hoping to find, as you do for linear solvers or unstructured partitioning, some sort of library or basic algorithm to start from (LAPACK, PETSc, METIS, SCOTCH, etc.). By the way, literature on parallelization and partitioning strategies for multi-block structured meshes is not as available as one might think (Peric’s or Anderson’s books barely mention it, and most articles are pay-walled or too old to be digital), and it tends to be quite terse for the non-CS specialist.

My question, I guess, using this newly learned lingo, is: is there a reasonably simple/useful partitioning algorithm (or an already implemented FOSS library) that I could use for block-structured meshes and that produces local partitioned structured meshes?

There is a heuristic algorithm, originally developed at LLNL by Bill Crutchfield, that I used in the (long ago) past. See Crutchfield, “Load balancing irregular algorithms”, Technical Report UCRL-JC-107679, Lawrence Livermore National Laboratory, 1991.

There is a more recent paper by Bell et al., “Parallelization of structured, hierarchical adaptive mesh refinement”, Computing and Visualization in Science, August 2000 (available at https://www.researchgate.net/publication/225673373), which gives pseudo-code for the algorithm.
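For flavour, here is a minimal Python sketch of a largest-first greedy heuristic in the spirit of those papers (next-heaviest box always goes to the currently least-loaded process); the papers themselves give the actual algorithm and its refinements, so treat this only as an illustration:

```python
import heapq

def greedy_assign(work, nproc):
    """Largest-first greedy load balancing: sort boxes by decreasing work
    and give each to the least-loaded process (kept in a min-heap)."""
    heap = [(0.0, p) for p in range(nproc)]   # (current load, process id)
    heapq.heapify(heap)
    owner = [None] * len(work)
    for i in sorted(range(len(work)), key=lambda i: -work[i]):
        load, p = heapq.heappop(heap)
        owner[i] = p
        heapq.heappush(heap, (load + work[i], p))
    loads = [0.0] * nproc
    for i, p in enumerate(owner):
        loads[p] += work[i]
    return owner, loads

# five boxes of unequal cost (e.g. cell counts) onto two processes
owner, loads = greedy_assign([8, 7, 6, 5, 4], 2)
```

Being a heuristic, it does not always find the optimum (here it yields loads of 17 and 13 where 15/15 is possible), but it is simple and usually close enough for box-shaped work units.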

For most multi-block grid systems you usually create each block separately. I think what you are asking for, though, is a tool to subdivide an existing structured grid into smaller blocks. There is a tool that comes with CFL3D that does that (I think).
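Absent such a tool, recursive bisection along the longest axis is one simple way to cut a structured block into near-equal structured sub-blocks. A Python sketch (the (origin, shape) index-box convention is my own, not from any of the codes mentioned):

```python
def split_box(shape, nparts):
    """Recursively bisect an (ni, nj[, nk]) index box into nparts structured
    sub-boxes of near-equal size, always cutting the longest axis.
    Returns a list of (origin, shape) tuples."""
    def rec(origin, shape, n):
        if n == 1:
            return [(origin, shape)]
        d = max(range(len(shape)), key=lambda a: shape[a])  # longest axis
        n_lo = n // 2
        cut = shape[d] * n_lo // n          # cut proportionally to n_lo/n
        lo_sh = shape[:d] + (cut,) + shape[d+1:]
        hi_sh = shape[:d] + (shape[d] - cut,) + shape[d+1:]
        hi_or = origin[:d] + (origin[d] + cut,) + origin[d+1:]
        return rec(origin, lo_sh, n_lo) + rec(hi_or, hi_sh, n - n_lo)
    return rec((0,) * len(shape), tuple(shape), nparts)

parts = split_box((8, 8, 4), 4)   # four (4, 4, 4) sub-boxes
```

Each sub-box stays a plain (i,j,k) rectangle, so the local arrays and halo machinery are unchanged; only the block-to-block connectivity table grows.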


Welcome to Fortran-lang Discourse @slopezcastano.

You might find the following libraries (and their respective documentation, publications) useful.

For meshing:

All of the above libraries use parallel-enabled data structures.

In terms of graph partitioners and load balancing there are many great libraries out there. During my PhD days I used the following a lot:


To experiment with the above packages, probably one of the easiest ways is through PETSc itself, since you can download, configure, and compile them quite trivially with something like:

./configure \
--with-debugging=0 --with-petsc-arch=LB \
--COPTFLAGS="-O3 -g -march=native" \
--CXXOPTFLAGS="-O3 -g -march=native" \
--FOPTFLAGS="-O3 -g -march=native" \
--download-metis=1 --download-parmetis=1 \
--download-ptscotch=1 \
--download-zoltan=1 \
--download-moab=1

One possible option is to subdivide each block of the multi-block mesh into a number of small tiles (8×8×8, for example) and then distribute these tiles across MPI processes.

Space-filling curves (Hilbert, Peano) can be used to reduce the 3D/2D load-balancing problem to a 1D one.
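A minimal Python sketch of both ideas combined, using a Morton (Z-order) key for brevity (a Hilbert key gives better locality but takes more code); the tile counts are made up for the example:

```python
def morton3(i, j, k, bits=10):
    """Interleave the bits of (i, j, k) into one Morton (Z-order) key,
    so sorting tiles by key walks them along a space-filling curve."""
    key = 0
    for b in range(bits):
        key |= ((i >> b & 1) << (3*b)) | ((j >> b & 1) << (3*b + 1)) \
             | ((k >> b & 1) << (3*b + 2))
    return key

def distribute_tiles(tiles, nproc):
    """Sort tiles along the curve, then hand out contiguous near-equal
    chunks so each process receives a spatially clustered set."""
    ordered = sorted(tiles, key=lambda t: morton3(*t))
    n = len(ordered)
    return [ordered[p*n // nproc:(p+1)*n // nproc] for p in range(nproc)]

# e.g. one block cut into 4x4x4 tiles of 8^3 cells each
tiles = [(i, j, k) for i in range(4) for j in range(4) for k in range(4)]
chunks = distribute_tiles(tiles, 4)   # 16 tiles per process
```

Because the curve ordering is 1D, load balancing reduces to chopping a sorted list into equal-weight segments, and neighbouring tiles tend to land on the same process.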
