Spectral derivative in 2-d

I’d like to use the MKL FFT routines to apply the Laplace operator on a 2-d periodic function sampled on a N \times N grid. I’ve figured out how to calculate the forward and backward transforms, but I can’t seem to figure out the correct scaling factors of the transformed multi-dimensional complex conjugate-even array:

! dfti.f90 -
!     Apply Laplace operator to a periodic function in 2-d using DFT
!
! Compile using:
!   ifort -warn all -qmkl dfti.f90 -o dfti
!
include "mkl_dfti.f90"
program dfti

  use mkl_dfti
  implicit none

  integer, parameter :: dp = kind(1.0d0)

  type(dfti_descriptor), pointer :: dh

  integer, parameter :: n = 16
  real(dp), target :: x(n,n), xpp(n,n)
  real(dp), pointer :: x1d(:) => null()

  integer, parameter :: ny = int(n/2.0) + 1
  complex(dp), target :: y(ny,n)
  complex(dp), pointer :: y1d(:) => null()

  real(dp), parameter :: pi = 4.0_dp*atan(1.0_dp)
  real(dp) :: sigma

  integer :: xdim(2), i, j, rstrides(3), cstrides(3), istat

  call init_x(n,x,xpp)

  open(unit=10,file="field_in.txt")
  do i = 1, n
    do j = 1, n
      write(10,*) (i-1)/real(n,dp), (j-1)/real(n,dp), x(i,j)
    end do
  end do
  close(10)

  x1d(1:n*n) => x
  y1d(1:ny*n) => y

  xdim = [n, n]
  rstrides = [0,1,n]
  cstrides = [0,1,ny]

  istat = DftiCreateDescriptor(dh,DFTI_DOUBLE,DFTI_REAL,2,xdim)
  call check_dfti_error(istat)

  istat = DftiSetValue(dh, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX)
  istat = DftiSetValue(dh, DFTI_PLACEMENT,DFTI_NOT_INPLACE)
  istat = DftiSetValue(dh, DFTI_INPUT_STRIDES, rstrides)
  istat = DftiSetValue(dh, DFTI_OUTPUT_STRIDES, cstrides)

  istat = DftiCommitDescriptor(dh)
  call check_dfti_error(istat)

  ! Perform a real to complex conjugate-even transform
  istat = DftiComputeForward(dh,x1d,y1d)
  call check_dfti_error(istat)

!  Calculate second derivative coefficients
!  do j = 1, n
!    do i = 1, ny
!      y(i,j) = -i**2 * y(i,j)
!    end do
!  end do

  istat = DftiSetValue(dh, DFTI_INPUT_STRIDES, cstrides)
  istat = DftiSetValue(dh, DFTI_OUTPUT_STRIDES, rstrides)

  sigma = 1.0_dp/n**2
  istat = DftiSetValue(dh, DFTI_BACKWARD_SCALE, sigma)
  istat = DftiCommitDescriptor(dh)

  istat = DftiComputeBackward(dh,y1d,x1d)
  call check_dfti_error(istat)

  open(unit=10,file="field_out.txt")
  do i = 1, n
    do j = 1, n
      write(10,'(*(G0,1X))') (i-1)/real(n,dp), (j-1)/real(n,dp), x(i,j), xpp(i,j)
    end do
  end do
  close(10)

  istat = DftiFreeDescriptor(dh)
  call check_dfti_error(istat)

contains

  subroutine init_x(n,f,fpp)
    integer, intent(in) :: n
    real(dp), intent(out) :: f(n,n),fpp(n,n)

    real(dp), parameter :: pi = 4.0_dp*atan(1.0_dp)
    real(dp) :: h, tx, ty
    integer :: i, j

    h = 1.0_dp/real(n,dp)

    do j = 1, n
      do i = 1, n

        tx = (j-1)*h
        ty = (i-1)*h

        f(i,j) = sin(2*pi*tx)*cos(2*pi*ty)

        ! f''
        fpp(i,j) = -8.0_dp*pi**2 * sin(2*pi*tx) * cos(2*pi*ty)
      end do
    end do

  end subroutine

  subroutine check_dfti_error(status,warn_only)
    use, intrinsic :: iso_fortran_env, only: error_unit
    integer, intent(in) :: status
    logical, intent(in), optional :: warn_only  ! default is .false.
    
    character(len=DFTI_MAX_MESSAGE_LENGTH) :: errmsg
    logical :: warn_only_

    warn_only_ = .false.
    if (present(warn_only)) warn_only_ = warn_only

    if (status /= 0) then
      errmsg = DftiErrorMessage(status)
      write(error_unit,'(A)') errmsg
      if (.not. warn_only_) stop
    end if
  end subroutine

end program

I’ve found a few formulas in these documents:

The following slide from a series of lecture notes from the National Taiwan University (I lost the webpage, unfortunately, so I’m not fully certain about the provenance) explains the algorithm in 1-d:

(note the storage of complex values differs between FFT libraries)

An older mirror of the MKL manual, describes the format I’m using in my example above:

The CCE format stores the values of the first half of the output complex conjugate-even signal resulted from the forward FFT. Note that the one-dimensional signal stored in CCE format is one complex element longer. For multi-dimensional real transform, n1 * n2 * n3 * ... * nk the size of complex matrix in CCE format is (n1/2+1)* n2 * n3 * ... * nk for Fortran and n1 * n2 * ... * (nk/2+1) for C.

Does someone have experience with multi-dimensional DFTs or can recommend a good resource to understand them?

We use both complex-complex and real-complex (conjugate even storage) in our code. Both MKL and FFTW use the same layout. See the routines (c)zfftifc_fftw.f90 and (c)zfftifc_mkl.f90 for the (single-) double-precision interfaces to FFTW and MKL, respectively.

Here is our MKL interface routine for real to complex transforms:

subroutine rzfftifc(nd,n,sgn,r,z)
use mkl_dfti
implicit none
! arguments
integer, intent(in) :: nd,n(nd),sgn
real(8), intent(inout) :: r(*)
complex(8), intent(inout) :: z(*)
! local variables
integer status,p,i
real(8) t1
type(DFTI_DESCRIPTOR), pointer :: handle
! automatic arrays
integer strides(0:nd)
p=product(n(:))
t1=1.d0/dble(p)
status=DftiCreateDescriptor(handle,DFTI_DOUBLE,DFTI_REAL,nd,n)
status=DftiSetValue(handle,DFTI_CONJUGATE_EVEN_STORAGE,DFTI_COMPLEX_COMPLEX)
status=DftiSetValue(handle,DFTI_PLACEMENT,DFTI_NOT_INPLACE)
status=DftiSetValue(handle,DFTI_FORWARD_SCALE,t1)
strides(0)=0
strides(1)=1
if (nd.gt.1) then
  strides(2)=n(1)/2+1
  do i=2,nd-1
    strides(i+1)=strides(i)*n(i)
  end do
end if
if (sgn.eq.-1) then
  status=DftiSetValue(handle,DFTI_OUTPUT_STRIDES,strides)
  status=DftiCommitDescriptor(handle)
  status=DftiComputeForward(handle,r,z)
else
  status=DftiSetValue(handle,DFTI_INPUT_STRIDES,strides)
  status=DftiCommitDescriptor(handle)
  status=DftiComputeBackward(handle,z,r)
end if
status=DftiFreeDescriptor(handle)
end subroutine

…and here is the FFTW equivalent:

subroutine rzfftifc(nd,n,sgn,r,z)
implicit none
! arguments
integer, intent(in) :: nd,n(nd),sgn
real(8), intent(inout) :: r(*)
complex(8), intent(inout) :: z(*)
! local variables
integer, parameter :: FFTW_ESTIMATE=64
integer p
integer(8) plan
real(8) t1
!$OMP CRITICAL(rzfftifc_)
if (sgn == -1) then
  call dfftw_plan_dft_r2c(plan,nd,n,r,z,FFTW_ESTIMATE)
else
  call dfftw_plan_dft_c2r(plan,nd,n,z,r,FFTW_ESTIMATE)
end if
!$OMP END CRITICAL(rzfftifc_)
call dfftw_execute(plan)
!$OMP CRITICAL(rzfftifc_)
call dfftw_destroy_plan(plan)
!$OMP END CRITICAL(rzfftifc_)
if (sgn == -1) then
  p=product(n(:))
  t1=1.d0/dble(p)
  p=p/n(1)
  p=p*(n(1)/2+1)
  call zdscal(p,t1,z,1)
end if
end subroutine

The following code sets up the Q-point grids (these are reciprocal to real-space grids):

! find next largest FFT-compatible Q-point grid size (radices 2, 3, 5 and 7)
call nfftifc(4,ngridq(1))
call nfftifc(4,ngridq(2))
call nfftifc(4,ngridq(3))
! total number of Q-points
nqpt=ngridq(1)*ngridq(2)*ngridq(3)
! number of complex FFT elements for real-complex transforms
n1=ngridq(1)/2+1
nfqrz=n1*ngridq(2)*ngridq(3)
! integer grid intervals for the Q-points
intq(1,:)=ngridq(:)/2-ngridq(:)+1
intq(2,:)=ngridq(:)/2

The routine nfftifc finds the nearest value of ngridq using the first 4 prime factors (2,3,5 and 7). Note that the number of complex even storage points is given by nfqrz which is about half the number of ngridq(1)*ngridq(2)*ngridq(3), and it’s the first index ngridq(1) that is halved.

The map from the Q-point given by integers (i1,i2,i3) to the grid compatible with MKL and FFTW is:

do iq=1,nqpt
  i1=ivq(1,iq); i2=ivq(2,iq); i3=ivq(3,iq)
  if (i1 >= 0) then
    j1=i1
  else
    j1=ngridq(1)+i1
  end if
  if (i2 >= 0) then
    j2=i2
  else
    j2=ngridq(2)+i2
  end if
  if (i3 >= 0) then
    j3=i3
  else
    j3=ngridq(3)+i3
  end if
  iqfft(iq)=j3*ngridq(2)*ngridq(1)+j2*ngridq(1)+j1+1
! map from q-point index to real-complex FFT index and vice versa
  if (i1 >= 0) then
    ifq=j3*ngridq(2)*n1+j2*n1+j1+1
    ifqrz(iq)=ifq
    iqrzf(ifq)=iq
  end if
end do

In the above, iqfft is a map from (i1,i2,i3) for the complex-complex transform. The map ifqrz is from (i1,i2,i3) to the even storage index and iqrzf is the inverse map.

Hope this helps.

I have an FFTW-based solution in DAMASK/src/grid/spectral_utilities.f90 at master · eisenforschung/DAMASK · GitHub. It’s for 3D and uses MPI.

A few comments

  • I use in-place transforms with pointers to the memory allocated in C. That is what FFTW recommends.
  • There is code for scalar, 3-vector and 3x3-tensor fields
  • Forward is real-to complex, backwards is complex-to-real to save memory
  • There are different differential operators (called filters). The standard one is continuous and uses unmodified frequencies.
  • The self test at the end is helpful to see if the data layout is correct.
  • MPI parallelization is along the third direction.
  • The last transpose is not carried out, so data layout in Fourier space is transposed. This saves communication.

Regarding literature: Most of the things I know are from the FFTW manual, e.g.

I think that the reason for plus one in ny = int(n/2.0) + 1 (also similar in FFTW as far as I know) is a technical point. Otherwise, N real space coefficient should have N Fourier space coefficient. Therefore, the inner loop should be up to n/2, not ny. Also, in the analytical solution, there seems to be wrong by a factor of 2. The derivative is with respect to x, therefore 2*pi should be multiplied for each differentiation.

  !Calculate second derivative coefficients
  do j = 1, n
    do i = 1, ny-1
      y(i,j) = -((i-1)*2.0_dp*pi)**2 * y(i,j)
    end do
  end do

Ooops, sorry about the misleading comment. The analytical solution xpp is meant to be \Delta f =\nabla^2 f = \partial_{xx} f + \partial_{yy} f.

For the function f = \sin{(2 \pi x)} \cos{(2 \pi y)}, this means \Delta f = -8\pi^2 \sin{(2 \pi x)} \cos{(2 \pi y)}.

@ivanpribec

You might find the pseudo-code algorithms in David Kopriva’s book “Implementing Spectral Methods for Partial Differential Equations” informative. see

Its my go to book for almost everything related to implementing the various forms of spectral methods from Fourier interpolation through DG spectral element methods. The algorithms are very easy to translate into Fortran and are worth the price of the book by themselves.

Edit

The following will take you to the supplemental material for Kopriva’s book. The zip file contains a PDF that is a corrected version of the chapter that covers “Algorithms for Periodic Functions” and discusses computing the Fourier function derivatives

https://extras.springer.com/?query=978-90-481-2260-8

That is an odd looking expression. It is equivalent to

ny = int( real(n) / 2.0 ) + 1

I’m not seeing why the conversion to real and then converting back to integer is necessary.

@RonShepard
This was not my suggestion, but a copy/paste from @ivanpribec code.

Then, yes, the analytical derivative is correct. I suggest to use a function with multiple frequencies to make sure the spectral derivative is free from minor bug, e.g.
f(i,j) = sin(2.0_dp*cos(2*pi*tx))*cos(3.0_dp*sin(2*pi*ty))

For the laplacian, then the multiplication of coefficients should depend on the index j as well:

  !Calculate second derivative coefficients
  do j = 1, n
    do i = 1, ny-1
      jt=j
      if(j>n/2) jt=n-j+2
      y(i,j) = -((i-1)**2+(jt-1)**2)*(2.0_dp*pi)**2 * y(i,j)
    end do
  end do

Since the function is real, the Fourier coefficients have the following property:

1 Like

Thanks @alirezagh76 and others for your help and suggestions. The suggested loop nest works, with the caveat, it assumes the domain length is 1 in both directions.

That’s a good point, but I didn’t want to complicate my MWE with a more complex example.

What a lucky find with this particular chapter. The HORSE2D & HORSES3D solvers are based on the algorithms in the book by Daniel Kopriva. (In slovenian, my mother tongue, kopriva means nettle, but I doubt that’s the surname origin.)

@ivanpribec may wish to know that the surname -kopriva- is either a topographic name (like the English surname Nettleton) or a nickname for a person with a prickly temperament according to the Oxford Dictionary of Surnames by Hanks and Hodges.