Generic or Kind-Agnostic linear system solvers

First, thank you for your answer.

The way I am tackling the task is to first set up a solver with intrinsics and build more complicated things from there. The first step is of course to change the intrinsics to BLAS/LAPACK, but at that point there is no need to throw away the previous implementation, which could be helpful as a gateway to explain the methods (a CG is basically 20 lines of Fortran, apart from declarations). Moreover, providing both an intrinsic and a BLAS/LAPACK implementation could be extremely useful for understanding how to use those tools.

So, how do we make them not mutually exclusive at compile time, i.e. not have to choose either intrinsics or LAPACK, but rather have both of them compiled?

This is super interesting; these are called procedure pointers, right? I will try looking into them, but are they “as dangerous” as regular pointers (which I am not familiar with either; I still have to check them out in detail)?
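For future readers, here is a minimal sketch of the idea (module and procedure names are made up; the BLAS version assumes an external ddot). Unlike data pointers, procedure pointers cannot alias memory or leak; the main hazard is calling one that is not associated.

module dot_backends
  use stdlib_kinds, only: dp
  implicit none
  abstract interface
    function dot_ifc(x, y) result(d)
      import dp
      real(dp), intent(in) :: x(:), y(:)
      real(dp) :: d
    end function dot_ifc
  end interface
  !! Runtime-switchable backend, defaulting to the intrinsic one
  procedure(dot_ifc), pointer :: dot_impl => intrinsic_dot
contains
  function intrinsic_dot(x, y) result(d)
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: d
    d = dot_product(x, y)
  end function intrinsic_dot
  function blas_dot(x, y) result(d)
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: d
    real(dp), external :: ddot  ! reference BLAS
    d = ddot(size(x), x, 1, y, 1)
  end function blas_dot
end module dot_backends

A solver then calls dot_impl(x, y); both implementations are always compiled, and a single dot_impl => blas_dot switches every call site at run time.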

Dear all, I’m here to propose the zero-th iteration of the iterative method suite. First, let me thank @hkvzjal, @ivanpribec, @loiseaujc and @FedericoPerini for their replies to my questions and their suggestions. Moreover, I want to ask you the question “how much fypp is too much fypp?” and get your feedback, as you will see I might have gone a bit too far. Of course, everything in this proposal is open to changes, from naming to code structure, according to your suggestions, so please do not refrain from criticism.

The zero-th iteration only contains a Preconditioned Conjugate Gradient, as I am mostly studying the structure of the library rather than the actual methods. Also, sorry for the wall of text.

API proposal

The interface aims to be simple but customizable. The simplest case, for the user, is just
  call stdlib_linalg_cg( A, rhs, x)

within a program. This call solves the linear system A\mathbf{x}=\mathbf{b} with a tolerance tol = 10^{-7} (the default of cg_options, shown below), a maximum of 1000 iterations, and does not print anything out. Optionally, the call

  call stdlib_linalg_cg( A, rhs, x, info)

also provides information, through the integer info, about whether the solver ran successfully (info = 0) or not (info = -1). Building on this, one can change the default options with

  call set_cg_opts(itmax=15)
  call stdlib_linalg_cg( A, rhs, x, info) ! Runs maximum 15 iterations 

and so modify these parameters for all successive runs, since the container of the default options (default_opts) is a global, public variable. A call with the option restore=.true. restores the default values for the successive runs:

  call set_cg_opts(restore=.true.)
  call stdlib_linalg_cg( A, rhs, x, info) ! Runs with original options

Clearly, one might want multiple sets of options, to run a CG under different conditions (e.g., to solve a linear system with good accuracy and to use it as a preconditioner for a Rayleigh quotient minimization, with low accuracy or a small maximum iteration number). This can be achieved by defining an option type:

  !! Option type
  type(cg_options)    :: cg_solver, cg_precond
  […]
  call set_cg_opts(cg_solver, tol_dp=1.0e-10_dp)
  call set_cg_opts(cg_precond, tol_dp=1.0e-4_dp)
  […]
  call stdlib_linalg_cg( A, rhs, x, cg_solver)
  call stdlib_linalg_cg( A, rhs, x, cg_precond)

Finally, one might need to specify an initial guess for the solver. This can be done with

  call stdlib_linalg_cg( A, rhs, x, xzero = .true. ) 

which tells the solver to compute the initial residual as \mathbf{r}=\mathbf{b} - A\mathbf{x}, where \mathbf{x} is the actual argument passed in the call. Clearly,

  call stdlib_linalg_cg( A, rhs, x, xzero = .false. ) 

computes the initial residual as \mathbf{r}=\mathbf{b} and allows some bookkeeping of what one is doing, if needed. Moving on to the next big topic.
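Putting the pieces above together, here is a minimal, self-contained usage sketch (the module name stdlib_linalg_iterative is a placeholder, since the final home of these routines is part of this proposal):

program demo_cg
  use stdlib_kinds, only: dp
  use stdlib_linalg_iterative, only: stdlib_linalg_cg, set_cg_opts  ! placeholder module name
  implicit none
  real(dp) :: A(3,3), rhs(3), x(3)
  integer  :: info
  ! A small symmetric positive-definite system
  A = reshape([4._dp, 1._dp, 0._dp, &
               1._dp, 3._dp, 1._dp, &
               0._dp, 1._dp, 2._dp], [3,3])
  rhs = [1._dp, 2._dp, 3._dp]
  x   = 0._dp
  call set_cg_opts(itmax=50)
  call stdlib_linalg_cg(A, rhs, x, info)
  print *, 'info =', info, ' x =', x
end program demo_cg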

Preconditioning
IMHO, there is no way to do versatile preconditioning without asking the user to work with an abstract type. The preconditioner type is defined as

  !!
  !! real(wp) preconditioner
  !!
  type, abstract :: wp_preconditioner
      !! Abstract type defining the empty shell.
    contains
      !! Procedure for the application of the preconditioner to a vector.
      procedure(wp_apply), deferred :: apply
      !!
  end type wp_preconditioner
  
  abstract interface
     !! Abstract procedure defining the interface for a
     !! general application action
     subroutine wp_apply(this,vec_in,vec_out,ierr)
       import wp
       import wp_preconditioner
       implicit none
       !! The class with all quantities describing the operator
       class(wp_preconditioner), intent(in   ) :: this
       !! The input vector
       real(wp),   intent(in   ) :: vec_in(:)
       !! The output vector y = "operator applied on" x
       real(wp),   intent(inout) :: vec_out(:)
       !! An integer flag to inform about errors.
       !! It is zero if no error occurred.
       integer,            intent(inout) :: ierr
     end subroutine wp_apply
     !!
     !!
  end interface

Please interpret wp as whatever precision is needed; this will be fypped later. This is unfortunately needed because, if we want to defer apply, we have to specify its interface and thus the kind of the input/output arrays.
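To make “fypped later” concrete, the kind expansion would presumably look something like this (a sketch, in the same fypp style used further below; the abstract interface is generated analogously):

#:for wp in ["sp", "dp"]
  type, abstract :: ${wp}$_preconditioner
    contains
      procedure(${wp}$_apply), deferred :: apply
  end type ${wp}$_preconditioner
#:endfor

As a trivial example, one could then define a Jacobi preconditioner: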

module Jacobi_prec
  use stdlib_kinds
  use stdlib_linalg_preconditioners
  implicit none
  private
  !! Extension of dp_preconditioner to define a Jacobi one
  type, public, extends(dp_preconditioner) :: dp_Jacobi
      !
      real(dp), allocatable :: Diag(:)
      integer               :: nrows
      !
    contains
      !
      procedure, public, pass :: init => init_Prec
      !! initialize procedure
      procedure, public, pass :: apply =>  apply_Prec
      !! apply procedure, overridden 
      !
  end type dp_Jacobi
  !!
contains
  !! constructor
  subroutine init_Prec(this, nrows, matIN)
    implicit none
    class(dp_Jacobi), intent(inout) :: this
    !
    integer,                intent(in   ) :: nrows
    real(dp),               intent(in   ) :: matIN(:,:)
    integer                               :: i
    !
    this%nrows = nrows
    allocate(this%Diag(nrows))
    do i = 1, this%nrows
      this%Diag(i) = matIN(i,i)
    end do
  end subroutine init_Prec
  !!
  !! apply the Jacobi (diagonal) preconditioner
  subroutine apply_Prec(this,vec_in,vec_out,ierr)
    implicit none
    class(dp_Jacobi), intent(in   ) :: this
    real(dp),               intent(in   ) :: vec_in(:)
    real(dp),               intent(inout) :: vec_out(:)
    integer,                intent(inout) :: ierr
    integer                               :: i
    !
    do i = 1, this%nrows
      vec_out(i) = vec_in(i) / this%Diag(i)
    end do
    ierr = 0
  end subroutine apply_Prec
  !! 
end module Jacobi_prec

The interesting feature of the “empty shell” approach is that it depends only on the kind of vector we apply the preconditioner to, through the apply routine. This means that with the same shell we already cover intrinsic matrices and the sparse matrix class under development. Any extension of this template to other vectors (e.g. abstract vectors, as in Enrico Facca’s proposal) is a matter of fypp (and actual implementation, clearly). Notice that the init routine cannot be made abstract, because it is impossible to specify its inputs a priori, so users should take their time to build their preconditioner (which is not a bad thing), while some basic preconditioners could be provided out of the box, e.g. Jacobi, SSOR, ICHOL and ILU, as a further expansion of the library.
The user-developed or library-provided preconditioner is then accessible in the code with

  type(dp_Jacobi)     :: Jacobi
  […]
  call Jacobi%init(n, A)
  […]
  call stdlib_linalg_cg( A = A, K = Jacobi, rhs = rhs, x = x, info = info ) 

where K is the preconditioner.
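For context, the only thing the solver presumably does with K is call its apply binding; here is a sketch of the relevant PCG steps (z, ierr and the r-z dot products are illustrative local names, not the actual internals):

  ! Preconditioned residual: z = K^{-1} r
  call K%apply(r, z, ierr)
  r_dot_z_new = dot_product(r, z)
  ! ...
  alpha = r_dot_z_new / dot_product(p, Ap)
  ! ... and when updating the search direction:
  beta = r_dot_z_new / r_dot_z_old
  p = z + beta * p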

What’s under the hood

Let’s start with the options. The `cg_options` type is defined as
  !! Definition: cg_options
  type, public :: cg_options
      !!
      !! Maximum number of iterations 
      integer :: itmax = 1000
      !!
      !! Single and double precision tolerance
      real(sp) :: tol_sp = 1.0e-7_sp
      real(dp) :: tol_dp = 1.0e-7_dp
      !!
      !! Verbosity and writing unit 
      logical  :: verb = .false.
      integer  :: out_unit = 6
      !!
  end type cg_options

And within the module there is the set_cg_opts subroutine, which I do not post here because it is just an awful set of conditionals. Its interface reads

  subroutine set_cg_opts(opt, itmax, tol_sp, tol_dp,  verb, out_unit, restore )
    !!
    type(cg_options), intent(inout), optional :: opt
    integer,  intent(in   ), optional :: itmax
    real(sp), intent(in   ), optional :: tol_sp
    real(dp), intent(in   ), optional :: tol_dp
    logical,  intent(in   ), optional :: verb
    integer,  intent(in   ), optional :: out_unit
    logical,  intent(in   ), optional :: restore
    ! [...]
  end subroutine set_cg_opts

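For completeness, the body is presumably just repetitions of this pattern (a sketch; the restore branch relies on the default initializers of cg_options):

  if ( present(opt) ) then
    if ( present(itmax) ) opt%itmax = itmax
    ! ... same for tol_sp, tol_dp, verb, out_unit
  else
    if ( present(restore) ) then
      if (restore) default_opts = cg_options()  ! reset to the type defaults
    end if
    if ( present(itmax) ) default_opts%itmax = itmax
    ! ... same for the remaining options
  end if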
Here I have a request for help: it is clear that this set of values is common to all solvers, but some (e.g. restarted GMRES) have more options (say a restart number nrest). It would be natural to define a base class kr_options and extend it when designing a method. However, I do not understand if and how inheritance works for the methods of the class, i.e. whether I can extend a method or not; see the sketch below. Ironically, the design of the options container is what keeps me up at night, and I cannot figure it out.
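For what it is worth, inheriting the data is the easy part; a sketch of how a GMRES extension could look (names and default value illustrative):

  type, extends(cg_options) :: gmres_options
    !! Inherits itmax, tol_sp, tol_dp, verb and out_unit
    integer :: nrest = 30  !! restart length
  end type gmres_options

Type-bound procedures are inherited as well and can be overridden, but an override must keep the same interface apart from the passed-object dummy, so an inherited setter cannot grow a new nrest keyword argument; the usual workaround is a separate setter per extension (e.g. a set_gmres_opts), while routines that only read the common fields can take a class(cg_options) dummy and work on both types.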

The actual solver

Here comes the fun part. This is the implemented algorithm, with its interface and all, for the case of intrinsic data types. What I am looking for here are suggestions about code structure and about possible checks to perform in the subroutine (not code checks or algorithmic-correctness checks).
  subroutine stdlib_krylov_intrinsic_dp_CG( A, rhs, x, info, xzero, opts_in )
    !!
    !! In-Out variables
    real(dp),  intent(in   ) :: A(:,:) 
    !! Linear operator to be inverted.
    real(dp),  intent(in   ) :: rhs(:)
    !! Right-hand side vector.
    real(dp),  intent(inout) :: x(:)
    !! Solution vector.
    integer,   intent(  out), optional :: info
    !! Information flag.
    logical,   intent(in   ), optional :: xzero
    !! Provided-initial-solution flag.
    type(cg_options),  intent(in   ), optional :: opts_in
    !! Option structure.
    !!
    !! Internal variables
    type(cg_options) :: options
    !! Default options
    real(dp), allocatable :: r(:)
    real(dp), allocatable :: p(:)
    real(dp), allocatable :: Ap(:)
    !! Residual, search direction and other vectors.
    real(dp) :: alpha, beta
    real(dp) :: r_dot_r_old, r_dot_r_new
    !! Scalars used in the CG algorithm.
    real(dp) :: residual
    !! Residual norm for exit criteria
    integer :: itnum
    !! Iteration counter
    integer :: n
    !! Problem size
    logical :: use_x0
    !! Whether an initial guess was provided

    n = size( A, 1 )
    if ( present(info) ) info = 0

    if (present(opts_in)) then
       options = opts_in
    else
       options = default_opts
    end if    
    !! Options to be set
    !!
    ! Allocate and initialize support vector
    allocate (Ap, mold = rhs)
    Ap = 0._dp

    ! Allocate residual and copy in rhs
    ! ( 2) Evaluate initial residual r_o = rhs ( - A*x_o later)
    allocate (r, source = rhs)

    ! ( 1) Parse whether an initial solution is provided or not.
    ! Note: .and. does not short-circuit in Fortran, so xzero must not
    ! be referenced in the same expression as present(xzero).
    use_x0 = .false.
    if ( present(xzero) ) use_x0 = xzero
    if ( use_x0 ) then
      ! It is provided
      ! ( 1) Keep x_o
      ! ( 2) Evaluate initial residual r_o = rhs - A*x_o
      Ap = matmul( A, x )
      r = r - Ap
    else
      ! It is not provided
      ! ( 1) Set x to zero
      x = 0._dp
      ! ( 2) r = b already performed in allocation
    end if

    ! Allocate direction vector and copy-in p = r.
    allocate (p, source = r  )

    ! Initialize dot product of residual: r_dot_r_old = r' * r.
    r_dot_r_old = dot_product( r, r )
    
    ! Safety computation to avoid a possible division by zero in r*r/p*Ap
    if ( sqrt(r_dot_r_old) == 0._dp ) p = options%tol_dp

    ! info flag 
    if ( present(info) ) info = -1

    ! CG Iteration Loop.
    cg_iterations: do itnum = 1, options%itmax

      ! ( 4) Compute A * p.
      Ap = matmul( A, p )
      ! ( 4) Compute step size alpha = r_dot_r_old / (p' * Ap).
      alpha = r_dot_r_old / dot_product( p, Ap )

      ! ( 5) Update solution x = x + alpha * p.
      x = x + (alpha) * p 

      ! ( 6) Update residual r = r - alpha * Ap.
      r = r + (-alpha) * Ap 

      ! ( 7) Compute new dot product of residual r_dot_r_new = r' * r.
      r_dot_r_new = dot_product( r, r ) 
      ! ( 7) Compute norm of residual.
      residual = sqrt(r_dot_r_new)

      ! ( 7) Check for convergence
      if (options%verb) then
         write (options%out_unit, *) "INFO : CG residual after ", (itnum), "iterations : ", residual
      end if

      if ( residual < options%tol_dp ) then
        if (options%verb) then
          write (options%out_unit, *) "INFO : CG Converged: residual ", residual, "< tolerance: ", options%tol_dp
        end if
        if ( present(info) ) info = 0
        exit cg_iterations
      end if

      ! ( 8) Compute new direction beta = r_dot_r_new / r_dot_r_old.
      beta = r_dot_r_new/r_dot_r_old

      ! ( 9) Update direction p = r + beta * p.
      p = beta * p + 1._dp * r 

      ! (10) Update r_dot_r_old for next iteration.
      r_dot_r_old = r_dot_r_new

    end do cg_iterations

    deallocate (r, p, Ap)

    return

  end subroutine stdlib_krylov_intrinsic_dp_CG

You might wonder: what kinds of data types are supported? If we look at the complete declaration of variables in this routine, we have a matrix, a vector, a scalar and, implicitly, a precision. These are the things that must be prescribed if we want to extend the algorithm to some other data type. So I propose the following approach: define the algorithm for plain intrinsic arrays, define it for the forthcoming sparse class, and define it for abstract Operators-Vectors-Scalars (I sense that abstract scalars will be handy for defining parallel reductions for parameters such as alpha, which is a ratio of scalar products), which I will start developing and testing soon. With these three macro-kinds we should be able to target a broad class of usages. In particular, the library currently handles the following (Matrix, Vector, Scalar) triples.

  sp:
  Matrix        Vector      Scalar
  real(sp)      real(sp)    real(sp)
  COO_sp        real(sp)    real(sp)
  CSR_sp        real(sp)    real(sp)
  CSC_sp        real(sp)    real(sp)
  ELL_sp        real(sp)    real(sp)
  sp_Operator   real(sp)    real(sp)
  sp_Operator   sp_Vector   real(sp)

  dp:
  Matrix        Vector      Scalar
  real(dp)      real(dp)    real(dp)
  COO_dp        real(dp)    real(dp)
  CSR_dp        real(dp)    real(dp)
  CSC_dp        real(dp)    real(dp)
  ELL_dp        real(dp)    real(dp)
  dp_Operator   real(dp)    real(dp)
  dp_Operator   dp_Vector   real(dp)
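Presumably all these specifics would then be collected behind the single generic name used in the API section above; a sketch (the specific names follow the subnam entries of the dictionaries shown below):

  interface stdlib_linalg_cg
    module procedure stdlib_krylov_intrinsic_dp_CG
    module procedure stdlib_krylov_COO_dp_CG
    module procedure stdlib_krylov_CSR_dp_CG
    ! ... one module procedure per row of the tables above
  end interface stdlib_linalg_cg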

Code the algorithm once, scatter on all data types

Ok, but how do we code it for all these data types without losing our minds? Enter fypp. The actual implemented routine is the result of the early days of development, when I coded all the Conjugate-“Something” kinds of algorithms, just to find errors and correct them multiple times, on multiple data structures, and so on. So I came up with the following:
  !!
  !!
  !! Conjugate Gradient Algorithm implementation:
  !!
  #:for tp in type_dict.values()
  #:for i in range(len(tp["subnam"]))
  !! ${tp["matrix"][i]}$-${tp["vector"][i]}$ classical version
  !!
  subroutine stdlib_krylov_${tp["subnam"][i]}$_CG( A, rhs, x, info, xzero, opts_in )
    !!
    !! In-Out variables
    ${tp["matrix"][i]}$,  intent(in   ) :: A${tp["matrnk"][0]}$ 
    !! Linear operator to be inverted.
    ${tp["vector"][i]}$,  intent(in   ) :: rhs${tp["vecrnk"][0]}$
    !! Right-hand side vector.
    ${tp["vector"][i]}$,  intent(inout) :: x${tp["vecrnk"][0]}$
    !! Solution vector.
    integer,   intent(  out), optional :: info
    !! Information flag.
    logical,   intent(in   ), optional :: xzero
    !! Provided-initial-solution flag.
    type(cg_options),  intent(in   ), optional :: opts_in
    !! Option structure.
    !!
    !! Internal variables
    type(cg_options) :: options
    !! Default options
    ${tp["vector"][i]}$, allocatable :: r${tp["vecrnk"][0]}$
    ${tp["vector"][i]}$, allocatable :: p${tp["vecrnk"][0]}$
    ${tp["vector"][i]}$, allocatable :: Ap${tp["vecrnk"][0]}$
    !! Residual, search direction and other vectors.
    ${tp["scalar"][i]}$ :: alpha, beta
    ${tp["scalar"][i]}$ :: r_dot_r_old, r_dot_r_new
    !! Scalars used in the CG algorithm.
    real(${tp["wp"][i]}$) :: residual
    !! Residual norm for exit criteria
    integer :: itnum
    !! Iteration counter
    integer :: n
    !! Problem size
    logical :: use_x0
    !! Whether an initial guess was provided

    n = ${ tp["rowsize"]("A") }$
    if ( present(info) ) info = 0

    if (present(opts_in)) then
       options = opts_in
    else
       options = default_opts
    end if    
    !! Options to be set
    !!
    ! Allocate and initialize support vector
    allocate (Ap, mold = rhs)
    ${ tp["coef"]("Ap") }$ = 0._${tp["wp"][i]}$

    ! Allocate residual and copy in rhs
    ! ( 2) Evaluate initial residual r_o = rhs ( - A*x_o later)
    allocate (r, source = rhs)

    ! ( 1) Parse whether an initial solution is provided or not.
    use_x0 = .false.
    if ( present(xzero) ) use_x0 = xzero
    if ( use_x0 ) then
      ! It is provided
      ! ( 1) Keep x_o
      ! ( 2) Evaluate initial residual r_o = rhs - A*x_o
      ${ tp["matvec"]("A", "x", "Ap", tp["wp"][i] ) }$
      ${ tp["coef"]("r") }$ = ${ tp["coef"]("r") }$ - ${ tp["coef"]("Ap") }$
    else
      ! It is not provided
      ! ( 1) Set x to zero
      ${ tp["coef"]("x") }$ = 0._${tp["wp"][i]}$
      ! ( 2) r = b already performed in allocation
    end if

    ! Allocate direction vector and copy-in p = r.
    allocate (p, source = r  )

    ! Initialize dot product of residual: r_dot_r_old = r' * r.
    r_dot_r_old = ${ tp["dotprod"]( "r", "r", "n") }$
    
    ! Safety computation to avoid a possible division by zero in r*r/p*Ap
    if ( sqrt(r_dot_r_old) == 0._${tp["wp"][i]}$ ) ${ tp["coef"]("p") }$ = options%tol_${tp["wp"][i]}$

    ! Set info to "not converged" until proven otherwise
    if ( present(info) ) info = -1

    ! CG Iteration Loop.
    cg_iterations: do itnum = 1, options%itmax

      ! ( 4) Compute A * p.
      ${ tp["matvec"]("A", "p", "Ap", tp["wp"][i] ) }$
      ! ( 4) Compute step size alpha = r_dot_r_old / (p' * Ap).
      alpha = r_dot_r_old / ${ tp["dotprod"]( "p", "Ap", "n") }$

      ! ( 5) Update solution x = x + alpha * p.
      ${ tp["axpy"]("alpha", "p", "x", tp["wp"][i], "n" ) }$

      ! ( 6) Update residual r = r - alpha * Ap.
      ${ tp["axpy"]("-alpha", "Ap", "r", tp["wp"][i], "n" ) }$

      ! ( 7) Compute new dot product of residual r_dot_r_new = r' * r.
      r_dot_r_new = ${ tp["dotprod"]( "r", "r", "n") }$ 
      ! ( 7) Compute norm of residual.
      residual = sqrt(r_dot_r_new)

      ! ( 7) Check for convergence
      if (options%verb) then
         write (options%out_unit, *) "INFO : CG residual after ", (itnum), "iterations : ", residual
      end if

      if ( residual < options%tol_${tp["wp"][i]}$ ) then
        if (options%verb) then
          write (options%out_unit, *) "INFO : CG Converged: residual ", residual, "< tolerance: ", options%tol_${tp["wp"][i]}$
        end if
        if ( present(info) ) info = 0
        exit cg_iterations
      end if

      ! ( 8) Compute new direction beta = r_dot_r_new / r_dot_r_old.
      beta = r_dot_r_new/r_dot_r_old

      ! ( 9) Update direction p = r + beta * p.
      ${ tp["axpby"]("1._"+tp["wp"][i], "r", "beta", "p", tp["wp"][i], "n" ) }$

      ! (10) Update r_dot_r_old for next iteration.
      r_dot_r_old = r_dot_r_new

    end do cg_iterations

    deallocate (r, p, Ap)

    return

  end subroutine stdlib_krylov_${tp["subnam"][i]}$_CG
  !!
  #:endfor
  #:endfor

Here is the trick: first define an empty dictionary of implementations in fypp, then define a dictionary specific to the implementation you want and add it in. As long as all the dictionaries are constructed with the same structure, everything works fine.

#:set wp_kinds = ["sp", "dp"]
#:set type_dict = {}
#:include "operations_intrinsics.fypp"
#:set type_dict = { "intrinsics": intrinsics_dict }
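Adding a second family later is then one include plus one merge; since #:set accepts any Python expression, something like this should do (the file name simply mirrors the intrinsics one):

#:include "operations_sparse.fypp"
#:set type_dict = dict(type_dict, sparse=sparse_dict)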

But what’s inside “operations_intrinsics.fypp”? The operations a generic iterative method needs, written for your specific data type.

#:mute
#!
#! Inquire operations
#! 
#:def intrinsics_rowsize(A)
size( ${A}$, 1 )
#:enddef

#:def intrinsics_colsize(A)
size( ${A}$, 2 )
#:enddef

#!
#! Access operations
#! 
#:def intrinsics_coef(vec)
${vec}$
#:enddef

#!
#! Vector-Vector operations
#! 
#:def intrinsics_dotprod( vec1, vec2 , n)
dot_product( ${vec1}$, ${vec2}$ )
#:enddef

#:def intrinsics_axpy( alpha, x, y, wp, n )
${y}$ = ${y}$ + (${alpha}$) * ${x}$ 
#:enddef

#:def intrinsics_axpby( alpha, x, beta, y, wp, n )
${y}$ = ${beta}$ * ${y}$ + ${alpha}$ * ${x}$ 
#:enddef

#!
#! Matrix-Vector operations
#! 
#:def intrinsics_matvec(A, vec_in, vec_out, wp )
${vec_out}$ = matmul( ${A}$, ${vec_in}$ )
#:enddef

#:def intrinsics_matAxpy(A, alpha, vec_in, vec_out, wp, n )
${vec_out}$ = ${alpha}$ * matmul( ${A}$, ${vec_in}$ ) + ${vec_out}$
#:enddef

#:def intrinsics_matAxpby(A, alpha, vec_in, beta, vec_out, wp, n )
${vec_out}$ = ${alpha}$ * matmul( ${A}$, ${vec_in}$ ) + ${beta}$ * ${vec_out}$
#:enddef

#!
#! Dictionary definition
#! 
#!                        | All these must be coherent    |
#:set intrinsics_dict = { &
            &   "subnam": [ "intrinsic_dp"],    &
            &   "matrix": [ "real(dp)"],        &
            &   "vector": [ "real(dp)"],        &
            &   "scalar": [ "real(dp)"],        &
            &   "wp":     [ "dp"],              &
            &   "matrnk": [ "(:,:)"],           &
            &   "vecrnk": [ "(:)"],             &
            &   "rowsize": intrinsics_rowsize,  &
            &   "colsize": intrinsics_colsize,  &
            &   "dotprod": intrinsics_dotprod,  &
            &   "coef": intrinsics_coef,        &
            &   "axpy": intrinsics_axpy,        &
            &   "axpby": intrinsics_axpby,      &
            &   "matvec": intrinsics_matvec,    &
            &   "matAxpy": intrinsics_matAxpy,  &
            &   "matAxpby": intrinsics_matAxpby &
            & }

#:endmute
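To see what the dictionaries buy us, take the axpy line of the template: with the intrinsics dictionary,

  ${ tp["axpy"]("alpha", "p", "x", "dp", "n" ) }$

expands to

  x = x + (alpha) * p

while with the sparse dictionary shown below the very same template line becomes a BLAS call:

  call axpy( n, alpha, p, 1, x, 1 )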

This particular example is not really useful, I agree, but it conveys the idea. In particular, it allows you to write the solver and test its algorithmic correctness, add checks, documentation and whatever you want, regardless of the data type you are using. Then you define another data type, say the sparse matrix set, define its dictionary, and include it into fypp. Voilà, you have your CG for sparse matrices as well. To give you an idea, the sparse dictionary looks like this:

#:mute

#:set SPARSE_TYPES = ["COO_", "CSR_", "CSC_", "ELL_"]

#!
#! Inquire operations
#! 
#:def sparse_rowsize(A)
${A}$%nrows
#:enddef

#:def sparse_colsize(A)
${A}$%ncols
#:enddef

#!
#! Access operations
#! 
#:def sparse_coef(vec)
${vec}$
#:enddef

#!
#! Vector-Vector operations
#! 
#:def sparse_dotprod( vec1, vec2 , n)
dot( ${n}$, ${vec1}$, 1, ${vec2}$, 1 )
#:enddef

#:def sparse_axpy( alpha, x, y, wp, n )
call axpy( ${n}$, ${alpha}$, ${x}$, 1, ${y}$, 1  ) 
#:enddef

#:def sparse_axpby( alpha, x, beta, y, wp, n )
      !  ${y}$ <- ${beta}$ * ${y}$
      !  ${y}$ <- ${y}$ + ${alpha}$ * ${x}$
      call scal( ${n}$, ${beta}$, ${y}$, 1 )
      call axpy( ${n}$, ${alpha}$, ${x}$, 1, ${y}$, 1 )
#:enddef

#!
#! Matrix-Vector operations
#! 
#:def sparse_matvec(A, vec_in, vec_out, wp )
! Computes ${vec_out}$ = ${A}$ * ${vec_in}$.
      ${vec_out}$ = 0._${wp}$
      call matvec( ${vec_out}$ , ${A}$ , ${vec_in}$ ) 
#:enddef

#:def sparse_matAxpy(A, alpha, vec_in, vec_out, wp, n )
      ! Accumulates ${A}$ * ${vec_in}$ into ${vec_out}$ (alpha assumed 1 for now)
      call matvec( ${vec_out}$ , ${A}$ , ${vec_in}$ )
#:enddef

#:def sparse_matAxpby(A, alpha, vec_in, beta, vec_out, wp, n )
      ${vec_out}$ = ${beta}$ * ${vec_out}$
      call matvec( ${vec_out}$ , ${A}$ , ${vec_in}$ )
#:enddef

#!
#! Dictionary definition
#! 
#!                                         |        All these must be coherent    |
#:set sparse_dict = { &
            &   "subnam": [x+y             for x in SPARSE_TYPES for y in wp_kinds], &
            &   "matrix": ["type("+x+y+")" for x in SPARSE_TYPES for y in wp_kinds], &
            &   "vector": ["real("+y+")"   for x in SPARSE_TYPES for y in wp_kinds], &
            &   "scalar": ["real("+y+")"   for x in SPARSE_TYPES for y in wp_kinds], &
            &   "wp":     [y               for x in SPARSE_TYPES for y in wp_kinds], &
            &   "matrnk": [" "],            &
            &   "vecrnk": ["(:)"],          &
            &   "rowsize": sparse_rowsize,  &
            &   "colsize": sparse_colsize,  &
            &   "dotprod": sparse_dotprod,  &
            &   "coef": sparse_coef,    &
            &   "axpy": sparse_axpy,        &
            &   "axpby": sparse_axpby,      &
            &   "matvec": sparse_matvec,    &
            &   "matAxpy": sparse_matAxpy,  &
            &   "matAxpby": sparse_matAxpby &
            & }

#:endmute

If you have been following the development of the sparse suite, you might have noticed that the matvec is no longer called matvec but spmv. Changing the name of the routine in the dictionary changes the call across the whole library (which is quite nice compared to going full “search and replace”). The only thing I dislike about this approach is the tedious syntax of Python dictionaries. The take-home message: code only the standard method and its preconditioned version once, then let fypp do the magic. Right now, the whole code for PCG is 300 lines long. After fypping it is 3000 lines long, but as long as you test the correctness of your dictionaries, all the replicas perform algorithmically the same operations.

Conclusions

Let me know what you think, what should be added, and what could be done in a cleverer/better way. This is the ground structure of what could become a future `stdlib_krylov`.

@kimala ,

In the context of this thread on generic solvers that may need to work with any KIND of data (especially floating-point, the so-called REAL values in Fortran):

  • it would be great if you and all Fortranners developed a vision for a Fortran language in which any fypp is “too much fypp”, and pushed the standards committees to develop facilities that do not require relying on preprocessing for Generics, and
  • kept the use of the object-oriented (OO) paradigm to a certain bare minimum when it comes to achieving generic/templated code for solvers.

@FortranFan ,

  • This is not only outside my expertise, but also outside my knowledge. I cannot provide any kind of valuable proposal to the standards committee, so I refrain from just stating wishlists and waiting for them to happen. Moreover, the committee could come out tomorrow with a solution to all our problems, but only God knows how long it would take to have it in the compilers. So my approach is to take the tools I have and build something first: something easy to use, useful, and easy to maintain. I am not proposing an iterative solver suite because I get bored during my weekends and need a hobby; I am proposing it because I wish I had this kind of thing in the codes I actually work with, because I am tired of coding like it is the nineties, and that is my 9-to-5 job. So my philosophy is to build now, and build well. If generics come out tomorrow, I will be the first one looking into them, and I will be happy to adapt the library to them.
  • I am no expert in OOP, so all my codes have just a tad of OOP. This is the reason the option container is keeping me up.