The only code I've worked with that uses high-order stencil schemes (CROCO-ocean, which is based on ROMS if I recall correctly) adjusts the size of the halo so that all the data needed by the stencil is available. Basically, you set the halo width as a variable and write the MPI exchanges so that they work for an arbitrary halo width. I had to implement this procedure inside NEMO-ocean.
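As an illustration, here is a minimal, self-contained sketch of that idea. It is not the CROCO or NEMO code: nh, field, sbuf, rbuf, left, right and so on are names I made up for the example, I only do a 1D decomposition along x, and I assume wp corresponds to MPI_DOUBLE_PRECISION. The point is simply that the halo width nh is a run-time variable and the exchange is written once against nh, so the same routine serves any stencil order.
PROGRAM variable_halo
   USE mpi
   IMPLICIT NONE
   INTEGER,  PARAMETER :: wp = SELECTED_REAL_KIND(12)   ! assumed to match MPI_DOUBLE_PRECISION
   INTEGER,  PARAMETER :: ni = 8, nj = 4                ! interior size on each rank
   INTEGER  :: nh                                       ! halo width, set at run time
   INTEGER  :: rank, nprocs, left, right, ierror
   REAL(wp), ALLOCATABLE :: field(:,:), sbuf(:,:), rbuf(:,:)
   !
   CALL MPI_Init(ierror)
   CALL MPI_Comm_rank(MPI_COMM_WORLD, rank,   ierror)
   CALL MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierror)
   left  = MERGE(MPI_PROC_NULL, rank-1, rank == 0)          ! 1D decomposition along x
   right = MERGE(MPI_PROC_NULL, rank+1, rank == nprocs-1)
   !
   nh = 3                                   ! e.g. a 7-point stencil needs a halo of 3
   ALLOCATE( field(1-nh:ni+nh, nj), sbuf(nh, nj), rbuf(nh, nj) )
   field = REAL(rank, wp)
   rbuf  = 0._wp                            ! halos on the physical boundary stay zero
   !
   ! Copy into a contiguous buffer (same trick as in the NEMO routines below),
   ! send my nh leftmost interior columns to the left neighbour and receive the
   ! right neighbour's nh leftmost interior columns into my right halo
   sbuf = field(1:nh, :)
   CALL MPI_Sendrecv( sbuf, nh*nj, MPI_DOUBLE_PRECISION, left,  1, &
      &               rbuf, nh*nj, MPI_DOUBLE_PRECISION, right, 1, &
      &               MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierror )
   field(ni+1:ni+nh, :) = rbuf
   !
   ! Symmetric exchange: rightmost interior columns go right, left halo is filled
   sbuf = field(ni-nh+1:ni, :)
   CALL MPI_Sendrecv( sbuf, nh*nj, MPI_DOUBLE_PRECISION, right, 2, &
      &               rbuf, nh*nj, MPI_DOUBLE_PRECISION, left,  2, &
      &               MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierror )
   field(1-nh:0, :) = rbuf
   !
   CALL MPI_Finalize(ierror)
END PROGRAM variable_halo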
I still have to dig into the paper (see sections 5.2 and 5.3) and the code, but the latest implementation of Oceananigans.jl uses a similar approach: it gathers a very large halo and integrates for several time steps, each one reducing the size of the usable halo, until only the interior domain is left; at that point a new halo exchange is performed. To my knowledge, that code is the only one in the ocean-modelling community adopting this strategy, and the procedure is only used in the so-called "barotropic sub-stepping", which is basically a 2D shallow-water model inside a 3D code.
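This is just my reading of the strategy, but a toy single-process sketch (not the Oceananigans.jl code) may make it concrete: with a halo of width nsub, a 3-point stencil can be sub-stepped nsub times before the interior needs fresh neighbour data, because each substep contaminates one more halo point on each side.
PROGRAM shrinking_halo
   IMPLICIT NONE
   INTEGER,  PARAMETER :: wp = SELECTED_REAL_KIND(12)
   INTEGER,  PARAMETER :: n = 16, nsub = 3      ! interior size; halo width = number of substeps
   REAL(wp) :: f(1-nsub:n+nsub)
   INTEGER  :: istep, i0, i1
   !
   CALL RANDOM_NUMBER(f)                        ! pretend the wide halo has just been exchanged
   DO istep = 1, nsub
      ! After istep substeps only points at least istep cells away from the halo edge
      ! are still correct, so the update range shrinks by one point on each side
      i0 = 1 - (nsub - istep)
      i1 = n + (nsub - istep)
      f(i0:i1) = f(i0:i1) + 0.1_wp * ( f(i0+1:i1+1) - 2._wp*f(i0:i1) + f(i0-1:i1-1) )
   END DO
   ! The interior f(1:n) has now advanced nsub substeps without any communication;
   ! a new wide halo exchange would be needed here before continuing
   PRINT *, 'interior after', nsub, 'substeps:', f(1:n)
END PROGRAM shrinking_halo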
Concerning the hijack of your thread:
my next project is to implement the suite of Arakawa grids in order to run a multilayer shallow-water model, handle the domain decomposition with MPI, and provide basic stencil operations over such grids. These basic operations should also be extremely fast to compute, possibly with multithreading or on GPU. All of this should be quite standard, so here is where I want to challenge you all: I would like stencil operations to be nestable. Take as an example the advection of the velocity u along x: it is usually computed as u_i * ( (u_i - u_{i-1})/dx + (u_{i+1} - u_i)/dx ) / 2, that is, a combination of an average and a differentiation. I would like to write a function x_avg and a function x_diff such that the previous operation could be written as easily as u * x_avg(x_diff(u)). I don't have a precise idea of how to do that, or even whether it is possible, but after discretizing an anisotropic diffusion operator this is my wildest dream.
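To sketch the kind of thing I have in mind, plain array-valued functions already compose this way (a 1D toy with periodic wrap-around, ignoring staggering, halos and performance; x_avg and x_diff here are hypothetical, not taken from any existing code):
MODULE stencil_ops
   IMPLICIT NONE
   INTEGER,  PARAMETER :: wp = SELECTED_REAL_KIND(12)
   REAL(wp), PARAMETER :: dx = 1._wp
CONTAINS
   PURE FUNCTION x_diff(f) RESULT(d)          ! backward difference: d_i = (f_i - f_{i-1})/dx
      REAL(wp), INTENT(in) :: f(:)
      REAL(wp)             :: d(SIZE(f))
      INTEGER              :: n
      n = SIZE(f)
      d(2:n) = ( f(2:n) - f(1:n-1) ) / dx
      d(1)   = ( f(1)   - f(n)     ) / dx     ! periodic wrap
   END FUNCTION x_diff
   !
   PURE FUNCTION x_avg(f) RESULT(a)           ! forward average: a_i = (f_i + f_{i+1})/2
      REAL(wp), INTENT(in) :: f(:)
      REAL(wp)             :: a(SIZE(f))
      INTEGER              :: n
      n = SIZE(f)
      a(1:n-1) = 0.5_wp * ( f(1:n-1) + f(2:n) )
      a(n)     = 0.5_wp * ( f(n)     + f(1)   )  ! periodic wrap
   END FUNCTION x_avg
END MODULE stencil_ops
With that, u * x_avg(x_diff(u)) expands element-wise to exactly u_i * ( (u_i - u_{i-1})/dx + (u_{i+1} - u_i)/dx ) / 2, the advection term above; the open question for me is how to make this kind of nesting allocation-free, staggering-aware and fast on a distributed 3D grid.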
Below you can see what I did in NEMO. is, js, ie, je are the indexes of the first and last points in the interior of the domain. I use the suffixes int and ext to denote the interior and exterior fields that are exchanged (i.e. the interior field of one domain goes into the exterior part of the neighbouring domain), for left, right, bottom and top. I can draw a sketch if that is not clear.
SUBROUTINE MPI_collect_r2l( kt, field_in, ni, nj, field_out, is, js, ie, je, hold)
!!----------------------------------------------------------------------
!! *** ROUTINE MPI_collect_r2l ***
!!
!! ** Purpose : Collect information from RIGHT to LEFT (r2l)
!!
!! ** Method :
!!
!! ** Action :
!!
!!-----------------------------------------------------------------------
INTEGER(i4), INTENT(in ) :: kt
INTEGER(i4), INTENT(in ) :: ni, nj
REAL(wp), DIMENSION( ni, nj), INTENT(in ) :: field_in
LOGICAL, INTENT(in ) :: hold
INTEGER(i4), INTENT(in ) :: is
INTEGER(i4), INTENT(in ) :: js
INTEGER(i4), INTENT(in ) :: ie
INTEGER(i4), INTENT(in ) :: je
REAL(wp), DIMENSION( ie-is+1, je-js+1), INTENT( out) :: field_out
REAL(wp), DIMENSION( ie-is+1, je-js+1) :: lint, rext
!
! MPI variables
INTEGER(i4) :: ierror, tag, my_rank
INTEGER     :: status(MPI_STATUS_SIZE)
!
lint = 0._wp
rext = 0._wp
field_out = 0._wp
!
! Copy array into contiguous memory space to avoid MPI problems
!
lint = field_in( is : ie, js : je)
tag = 65464
!
! Get my rank and do the corresponding job
CALL MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierror)
!
! East to West
!
IF ( MOD(my_rank, jpni) .ne. 0 ) CALL MPI_send(lint, size(lint), MPI_double_precision, my_rank-1, tag, MPI_COMM_WORLD, ierror)
IF ( MOD(my_rank, jpni) .ne. jpni-1 ) CALL MPI_Recv(rext, size(rext), MPI_double_precision, my_rank+1, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierror)
!
! Safely wait
!
IF ( hold ) call MPI_BARRIER(MPI_COMM_WORLD, ierror)
field_out = rext
!
END SUBROUTINE MPI_collect_r2l
SUBROUTINE MPI_collect_l2r( kt, field_in, ni, nj, field_out, is, js, ie, je, hold)
!!----------------------------------------------------------------------
!! *** ROUTINE MPI_collect_l2r ***
!!
!! ** Purpose : Collect information from LEFT to RIGHT (l2r)
!!
!! ** Method :
!!
!! ** Action :
!!
!!-----------------------------------------------------------------------
INTEGER(i4), INTENT(in ) :: kt
INTEGER(i4), INTENT(in ) :: ni, nj
REAL(wp), DIMENSION( ni, nj), INTENT(in ) :: field_in
LOGICAL, INTENT(in ) :: hold
INTEGER(i4), INTENT(in ) :: is
INTEGER(i4), INTENT(in ) :: js
INTEGER(i4), INTENT(in ) :: ie
INTEGER(i4), INTENT(in ) :: je
REAL(wp), DIMENSION( ie-is+1, je-js+1), INTENT( out) :: field_out
REAL(wp), DIMENSION( ie-is+1, je-js+1) :: rint, lext
!
! MPI variables
INTEGER(i4) :: ierror, tag, my_rank
INTEGER     :: status(MPI_STATUS_SIZE)
!
rint = 0._wp
lext = 0._wp
!
! Copy array into contiguous memory space to avoid MPI problems
!
rint = field_in( is : ie, js : je )
tag = 65464
!
! Get my rank and do the corresponding job
CALL MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierror)
!
! West to East
!
IF ( MOD(my_rank, jpni) .ne. jpni-1 ) CALL MPI_send(rint, size(rint), MPI_double_precision, my_rank+1, tag, MPI_COMM_WORLD, ierror)
IF ( MOD(my_rank, jpni) .ne. 0 ) CALL MPI_Recv(lext, size(lext), MPI_double_precision, my_rank-1, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierror)
!
! Safely wait
!
IF ( hold ) call MPI_BARRIER(MPI_COMM_WORLD, ierror)
field_out = lext
!
END SUBROUTINE MPI_collect_l2r
SUBROUTINE MPI_collect_t2b( kt, field_in, ni, nj, field_out, is, js, ie, je, hold)
!!----------------------------------------------------------------------
!! *** ROUTINE MPI_collect_t2b ***
!!
!! ** Purpose : Collect information from TOP to BOTTOM (t2b)
!!
!! ** Method :
!!
!! ** Action :
!!
!!-----------------------------------------------------------------------
INTEGER(i4), INTENT(in ) :: kt
INTEGER(i4), INTENT(in ) :: ni, nj
REAL(wp), DIMENSION( ni, nj), INTENT(in ) :: field_in
LOGICAL, INTENT(in ) :: hold
INTEGER(i4), INTENT(in ) :: is
INTEGER(i4), INTENT(in ) :: js
INTEGER(i4), INTENT(in ) :: ie
INTEGER(i4), INTENT(in ) :: je
REAL(wp), DIMENSION( ie-is+1, je-js+1), INTENT( out) :: field_out
REAL(wp), DIMENSION( ie-is+1, je-js+1) :: bint, text
!
! MPI variables
INTEGER :: ierror, tag, my_rank
INTEGER :: status(MPI_STATUS_SIZE)
!
bint = 0._wp
text = 0._wp
!
! Copy array into contiguous memory space to avoid MPI problems
!
bint = field_in( is : ie, js : je)
tag = 65464
!
! Get my rank and do the corresponding job
CALL MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierror)
!
! North to South
!
IF ( INT(my_rank/jpni) .ne. 0 ) CALL MPI_send(bint, size(bint), MPI_double_precision, my_rank-jpni, tag, MPI_COMM_WORLD, ierror)
IF ( INT(my_rank/jpni) +1 .ne. jpnj ) CALL MPI_Recv(text, size(text), MPI_double_precision, my_rank+jpni, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierror)
!
! Safely wait
!
IF ( hold ) call MPI_BARRIER(MPI_COMM_WORLD, ierror)
field_out = text
!
END SUBROUTINE MPI_collect_t2b
SUBROUTINE MPI_collect_b2t( kt, field_in, ni, nj, field_out, is, js, ie, je, hold)
!!----------------------------------------------------------------------
!! *** ROUTINE MPI_collect_b2t ***
!!
!! ** Purpose : Collect information from BOTTOM to TOP (b2t)
!!
!! ** Method :
!!
!! ** Action :
!!
!!-----------------------------------------------------------------------
INTEGER(i4), INTENT(in ) :: kt
INTEGER(i4), INTENT(in ) :: ni, nj
REAL(wp), DIMENSION( ni, nj), INTENT(in ) :: field_in
LOGICAL, INTENT(in ) :: hold
INTEGER(i4), INTENT(in ) :: is
INTEGER(i4), INTENT(in ) :: js
INTEGER(i4), INTENT(in ) :: ie
INTEGER(i4), INTENT(in ) :: je
REAL(wp), DIMENSION( ie-is+1, je-js+1), INTENT( out) :: field_out
REAL(wp), DIMENSION( ie-is+1, je-js+1) :: tint, bext
!
! MPI variables
INTEGER :: ierror, tag, my_rank
INTEGER :: status(MPI_STATUS_SIZE)
!
tint = 0._wp
bext = 0._wp
!
! Copy array into contiguous memory space to avoid MPI problems
!
tint = field_in( is : ie, js : je)
tag = 65464
!
! Get my rank and do the corresponding job
CALL MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierror)
!
! South to North
!
IF ( INT(my_rank/jpni) +1 .ne. jpnj ) CALL MPI_send(tint, size(tint), MPI_double_precision, my_rank+jpni, tag, MPI_COMM_WORLD, ierror)
IF ( INT(my_rank/jpni) .ne. 0 ) CALL MPI_Recv(bext, size(bext), MPI_double_precision, my_rank-jpni, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierror)
!
! Safely wait
!
IF ( hold ) call MPI_BARRIER(MPI_COMM_WORLD, ierror)
field_out = bext
!
END SUBROUTINE MPI_collect_b2t