stdlib_LinAlgOperators: Linear Algebra Operators

Before starting, I have to thank @loiseaujc for the incredible support and for all the discussions we had. Moreover, this proposal is very close to his implementation of LightKrylov. I need to thank @ivanpribec, @FedericoPerini, @hkvzjal and @fluidnumerics_joe for the discussions. Also, this builds upon Enrico Facca’s PR, and it is being tested in the Krylov suite I’m working on (among other DDTs).

The following aims at laying the foundations for some abstract data type containing the notion of an operator and of the space(s) on which it acts. Nothing about this implementation is off the discussion (naming, operations to implement, implementations…), please share your comments, suggestions or doubts. Contributors are also welcome.

stdlib_LinAlgOperators: Linear Algebra Operators

From the Wikipedia page Operator:

In mathematics, an operator is generally a mapping or function that acts on elements of a space to produce elements of another space (possibly and sometimes required to be the same space).

We can write the operator \mathcal{A} as \mathcal{A} \colon U \rightarrow V where U is the domain (or set of inputs) and V is the codomain (or set of destination). Based on this definition, it is easy to envision something similar to be included in the Standard Library, for the Linear Algebra package.

Operators

To begin, lets consider Intrinsic Operators: those are the operators that take as an input a intrinsic array vec_in(:) and that provide a processed intrisic array vec_out(:). The current proposal is the same as Enrico Facca’s PR plus two extra deferred procedures and two extra default implementations.

The two extra procedures are get_domsize() and get_cdmsize() to get respectively the domain size and the codomain size: basically, the first returns the number of columns, the second one the number of rows. Notice that in some cases (like in distributed matrices/vectors, the local size of the operator does not coincide with the size of the domain and codomain, that’s why I’m proposing domsize and cdmsize and still have ncol and nrows defined somewhere else, so to have local and global.)

The two extra default operations are Axpy and Axpby, that can be defined easily through the deferred apply and some BLAS routines.

module stdlib_Operators
  use stdlib_kinds
  use stdlib_linalg_blas
  use stdlib_linalg_lapack
  !! 
  !! Module containing the abstract class parent for an
  !! operator derived type, for different precisions,
  !! uses AbstractVectors to match the precisions
  !!
  implicit none
  private
  !!
  !! Public types of operator
  !!
  public :: Operator_dp

  !!
  !! real(dp) abstract operator
  !!
  type, abstract :: Operator_dp
     !! Abstract type defining base general operator

     !! Members shared by any operator:     
     !! Number of columns
     integer :: ncol=0
     
     !! Number of rows 
     integer :: nrow=0

     !! Logical flag for symmetry
     logical :: is_symmetric=.false.

   contains
     !! Procedure for the application of the operator to a vector.
     procedure(application_dp),           deferred :: apply
     
     !! Procedure to inqire the domain size
     procedure(get_domSize_interface_dp), deferred :: get_domsize
     
     !! Procedure to inqire the domain size
     procedure(get_cdmSize_interface_dp), deferred :: get_cdmsize
     !
     !  Procedures defined upon apply
     !! 
     procedure, nopass :: axpby => Axpby_dp
     !!
     procedure, nopass ::  axpy => Axpy_dp
     !
  end type Operator_dp
  
  abstract interface
     !! Abstract procedure defining the interface for a general
     !! application actions
     !!
     !! y = M(x) = ( M*x if M is linear)
     subroutine application_dp(this,vec_in,vec_out,ierr)
       import dp
       import Operator_dp
       implicit none
       !! The class with all quantities describing the operator
       class(Operator_dp), intent(in   ) :: this
       !! The input" vector 
       real(dp),   intent(in   ) :: vec_in(:)
       !! The output vector y= "operator apllied on" x
       real(dp),   intent(inout) :: vec_out(:)
       !! An integer flag to inform about error
       !! It is zero if no error occured
       integer,               intent(inout) :: ierr
     end subroutine application_dp

     !! Abstract procedure defining the interface for a general
     !! get_domainSize procedure
     !!
     !! y = M(x) = ( M*x if M is linear)
     function get_domSize_interface_dp(this) result(size)
       import Operator_dp
       implicit none
       !! The class with all quantities describing the operator
       class(Operator_dp), intent(in   ) :: this
       !! The size of the domain
       integer                                  :: size
     end function get_domSize_interface_dp

     !! Abstract procedure defining the interface for a general
     !! get_codomainSize procedure
     !!
     !! y = M(x) = ( M*x if M is linear)
     function get_cdmSize_interface_dp(this) result(size)
       import Operator_dp
       implicit none
       !! The class with all quantities describing the operator
       class(Operator_dp), intent(in   ) :: this
       !! The size of the domain
       integer                                  :: size
     end function get_cdmSize_interface_dp

  end interface     
  
contains 

  !! y = M(x) = ( M*x if M is linear)
  subroutine Axpy_dp(this,x,y,ierr)
    implicit none
    !! The class with all quantities describing the operator
    class(Operator_dp), intent(in   ) :: this
    !! The input" vector 
    real(dp),   intent(in   ) :: x(:)
    !! The output vector y= "operator applied on" x
    real(dp),   intent(inout) :: y(:)
    !! An integer flag to inform about error
    !! It is zero if no error occured
    integer,               intent(inout) :: ierr
    !
    real(dp), allocatable :: vec_aux(:)

    allocate( vec_aux, mold=x )
    call this%apply(x, vec_aux, ierr)
    call axpy( size(y), 1._dp, vec_aux, 1, y, 1  ) 
  end subroutine Axpy_dp

  !! y = M(x) = ( M*x if M is linear)
  subroutine Axpby_dp(this,x,beta,y,ierr)
    implicit none
    !! The class with all quantities describing the operator
    class(Operator_dp), intent(in   ) :: this
    !! The input" vector 
    real(dp),   intent(in   ) :: x(:)
    real(dp),              intent(in   ) :: beta
    !! The output vector y= "operator applied on" x
    real(dp),   intent(inout) :: y(:)
    !! An integer flag to inform about error
    !! It is zero if no error occured
    integer,               intent(inout) :: ierr
    !
    real(dp), allocatable :: vec_aux(:)

    allocate( vec_aux, mold=x )
    call this%apply(x, vec_aux, ierr)
    call scal( size(y), beta, y, 1 )
    call axpy( size(y), 1._dp, vec_aux, 1, y, 1  ) 
  end subroutine Axpby_dp

end module stdlib_Operators

Why do we need an Operator acting on intrinsic arrays?

  • In SELF (if I remember correctly) the matrices are never explicitly computed but rather only their action is known. A standard way of making a container for such actions could be helpful.
  • On shared memory architectures, often you need to define the parallelization of the matrix-vector product, but the input and output vectors are completely in memory.
  • In the Krylov suite we will need preconditioners, and since the sparse suite acts on intrinsic arrays, so should the preconditioner. The preconditioner could be defined as an extension of the operator type or not. As for now, it is not.

Abstract Operators and Abstract Vectors

Abstract Operators

The Abstract operator is absolutely the same as the Intrinsic Operator, it just accepts abstract vectors as an input. Clearly, an abstract preconditioner is defined in the same manner as the intrinsic preconditioner, but dealing with abstract vectors.

Abstract Vectors

This module defines abstract vectors and bases of abstract vector. The fundamental procedures needed to be implemented by the user are init and kill, axpby and axpy, dot, getval and setval and rand. Upon these, a set of default implementations is provided.

About axpy: I know that it can be implemented easily upon axpby, but you would require an extra support vector to do the trick and to be fair, the two routines are extremely similar, so I don’t think it’s going to be a burden to code the both (i.e. copy paste rename an entire routine and get rid of the bs).

An important thing is that once you have set these operations, you have almost everything you need to work with bases as well. Some operations are already implemented.

There are also two routines called localizer and globalizer: these are for now just an identity mappings, but those are thought to be handles to access local indexes starting from global indexes and vice versa (e.g. for distributed vectors).

module stdlib_AbstractVectors
  use stdlib_kinds
  !! 
  !! Module containing the abstract class parent for a 
  !! vector derived type, for different precisions
  !!
  implicit none
  private
  !!
  !! Public types of vector
  !!
  public :: absVector_dp
  
  public :: linearCombination
  public :: projectionOntoBasis
  public :: copy

  interface linearCombination
    module procedure :: linearCombination_Vector_default_dp
    module procedure :: linearCombination_Basis_default_dp
  end interface

  interface projectionOntoBasis
    module procedure :: ProjectionOntoBasis_Vector_default_dp
    module procedure :: ProjectionOntoBasis_Basis_default_dp
  end interface

  interface copy
    module procedure :: Copy_Basis_default_dp
  end interface
  
  !!
  !! real(dp) abstract vector
  !!
  type, abstract :: absVector_dp
     !! Members shared by any operator:     
     !! Number of elements 
     integer :: size = 0
     !!
   contains
     !! Procedure for the initialization of the vector
     procedure(init_vector_interface_dp),         pass(self), deferred :: init
     !! Procedure for the destruction of the vector
     procedure(kill_vector_interface_dp),         pass(self), deferred :: kill
     !! Procedure to scale and add two vectors
     procedure(axpby_vector_interface_dp),        pass(y),    deferred :: axpby
     !! Procedure to scale one vector and add a constant
     procedure(apby_vector_interface_dp),         pass(y),    deferred :: apby
     !! Procedure to set the values
     procedure(dot_vector_interface_dp),          pass(self), deferred :: dot
     !! get procedure 
     procedure(get_value_interface_dp), public,   pass(self), deferred :: getval
     !! set procedure 
     procedure(set_value_interface_dp), public,   pass(self), deferred :: setval
     !! rand procedure 
     procedure(randvalue_interface_dp), public,   pass(self), deferred :: rand
     !
     !  Procedures defined upon axpby
     !! Procedure to scale first vector and add a second 
     procedure,    pass(y) :: axpy => axpy_default_dp
     !! Procedure to perform addition
     procedure,     nopass ::  add => xpy_default_dp
     !! Procedure to perform subtraction 
     procedure,     nopass ::  sub => xmy_default_dp
     !! Procedure to perform copy from x to y
     procedure,   pass(to) :: copy => copy_default_dp
     !
     !  Procedures defined upon axpb
     !! Procedure to set all the values to one given scalar
     procedure,   pass(self) :: setall => set_scalar_default_dp
     !! Procedure to scale a vector
     procedure, nopass ::   scal => scale_default_dp
     !
     !  Procedures defined upon dot
     !! Procedure to compute the norm
     procedure,   pass :: norm => norm_default_dp
     !
     ! Localizer and Globalizer
     procedure, nopass ::  iglob => globalizer_default_dp
     procedure, nopass ::   iloc =>  localizer_default_dp
     !
  end type absVector_dp
  
  abstract interface
     !! 
     !! Abstract procedure defining the interface for a general
     !! initialization scheme
     !!
     subroutine init_vector_interface_dp(self,size,source)
       import dp
       import absVector_dp
       implicit none
       class(absVector_dp), intent(inout) :: self
       integer,             intent(in   ) :: size
       real(dp), optional,  intent(in   ) :: source(:)
       !
     end subroutine init_vector_interface_dp
     !! 
     !! Abstract procedure defining the interface for a general
     !! destructor
     !!
     subroutine kill_vector_interface_dp(self) 
       import absVector_dp
       implicit none
       class(absVector_dp), intent(inout) :: self
       !
     end subroutine kill_vector_interface_dp


     !!
     !!      Procedure : axpby_vector_interface_dp(y,alpha,x,beta)
     !!
     !!      Operation : y <- alpha*x + beta*y
     !!
     !!      Rationale : This routine can be used for all basic operations
     !!                  with two vectors
     !!
     !! Implementation : deferred
     !!
     !!   Input-Output :
     !! 
     !!          y, intent(inout) : absVector  
     !!      alpha, intent(in   ) : real(dp)
     !!          x, intent(in   ) : absVector 
     !!       beta, intent(in   ) : real(dp)
     !!
     !!##### Example
     !!
     !!         ...
     !!         y = [4._dp, 5._dp, 6._dp] 
     !!         call y % axpy( alpha=3._dp, x=[1._dp, 2._dp, 3._dp], beta=1._dp )
     !!         print *, y ! 7, 11, 15
     !!         ...
     !!
     subroutine axpby_vector_interface_dp(y,alpha,x,beta)
       import dp
       import absVector_dp
       implicit none
       class(absVector_dp), intent(inout) :: y
       real(dp),            intent(in   ) :: alpha
       class(absVector_dp), intent(in   ) :: x
       real(dp),            intent(in   ) :: beta
       !
     end subroutine axpby_vector_interface_dp


     !!
     !!      Procedure : apby_vector_interface_dp(y,alpha,beta)
     !!
     !!      Operation : y <- alpha + beta*y 
     !!
     !!      Rationale : This routine can be used to initialize an
     !!                  abstract vector with a single value, or to
     !!                  scale the whole abstract vector with a given
     !!                  scalar, avoiding the allocation of a second 
     !!                  abstract vector (that would be needed with axpby)
     !!
     !! Implementation : deferred
     !!
     !!   Input-Output :
     !! 
     !!          y, intent(inout) : absVector  
     !!      alpha, intent(in   ) : real(dp)
     !!       beta, intent(in   ) : real(dp)
     !!
     !!##### Example
     !!
     !!         ...
     !!         !y = ...
     !!         call y % axpy( alpha=3._dp, 3._dp] )
     !!         print *, y ! 7, 8, 9
     !!         ...
     !!
     subroutine apby_vector_interface_dp(y,alpha,beta)
       import dp
       import absVector_dp
       implicit none
       class(absVector_dp), intent(inout) :: y
       real(dp),            intent(in   ) :: alpha
       real(dp),            intent(in   ) :: beta
       !
     end subroutine apby_vector_interface_dp


     !!
     !!      Procedure : get_value_interface_dp(self,index) result(res)
     !!
     !!      Operation : res <- self(index)  
     !!
     !!      Rationale : Access a specific index of the (possibly local) 
     !!                  abstract vector and return its value
     !!
     !! Implementation : deferred
     !!
     !!   Input-Output :
     !! 
     !!       self, intent(inout) : absVector  
     !!      index, intent(in   ) : integer
     !!        res, intent(  out) : real(dp)
     !!
     !!##### Example
     !!
     !!         ...
     !!         y = [4._dp, 5._dp, 6._dp]
     !!         res = y%getval(2)
     !!         print *, res ! 5
     !!         ...
     !!
     function get_value_interface_dp(self,index) result(res)
       import dp
       import absVector_dp
       implicit none
       class(absVector_dp), intent(in   ) :: self
       integer,                 intent(in   ) :: index
       real(dp)                           :: res
       !
     end function get_value_interface_dp


     !!
     !!      Procedure : set_value_interface_dp(self,index,val)
     !!
     !!      Operation : self(index) <- val
     !!
     !!      Rationale : Access a specific index of the (possibly local) 
     !!                  abstract vector and set the value
     !!
     !! Implementation : deferred
     !!
     !!   Input-Output :
     !! 
     !!       self, intent(inout) : absVector  
     !!      index, intent(in   ) : integer
     !!        val, intent(in   ) : real(dp)
     !!
     !!##### Example
     !!
     !!         ...
     !!         y = [4._dp, 5._dp, 6._dp]
     !!         call y%setval(2, 11._dp)
     !!         print *, y%getval(2) ! 11
     !!         ...
     !!
     subroutine set_value_interface_dp(self,index,val)
       import dp
       import absVector_dp
       implicit none
       class(absVector_dp), intent(inout) :: self
       integer,                 intent(in   ) :: index
       real(dp),         intent(in   ) :: val 
       !
     end subroutine set_value_interface_dp


     !!
     !!      Procedure : dot_vector_interface_dp(self,y) result(res)
     !!
     !!      Operation : res <- self^T*y
     !!
     !!      Rationale : Define the dot product between two abstract vectors
     !!
     !! Implementation : deferred
     !!
     !!   Input-Output :
     !! 
     !!       self, intent(in   ) : absVector  
     !!          y, intent(in   ) : absVector
     !!        res, intent(  out) : real(dp)
     !!
     !!##### Example
     !!
     !!         ...
     !!         y = [1._dp, 2._dp, 3._dp]
     !!         x = [4._dp, 5._dp, 6._dp]
     !!         print *, y%dot(x) ! 32
     !!         ...
     !! 
     function dot_vector_interface_dp(self,y) result(res)
       import dp
       import absVector_dp
       implicit none
       class(absVector_dp), intent(in   ) :: self
       class(absVector_dp), intent(in   ) :: y
       real(dp)                           :: res
       !
     end function dot_vector_interface_dp


     !!
     !!      Procedure : randvector_interface_dp(self,y) result(res)
     !!
     !!      Operation : self <- random_number()
     !!
     !!      Rationale : Define a procedure to randomize the input vector
     !!
     !! Implementation : deferred
     !!
     !!   Input-Output :
     !! 
     !!       self, intent(in   ) : absVector  
     !!       seed, intent(in   ) : integer
     !!
     !!##### Example
     !!
     !!         ...
     !!         [TODO]
     !!         ...
     !! 
     subroutine randvalue_interface_dp(self,seed) 
       import absVector_dp
       implicit none
       class(absVector_dp), intent(inout)           :: self
       integer,                 intent(in   ), optional :: seed(:)
       !
     end subroutine randvalue_interface_dp

  end interface
  

contains


  !!
  !!      Procedure : axpy_default_dp(self,alpha,x)
  !!
  !!      Operation : y <- alpha*x + y
  !!
  !!      Rationale : [TODO] forgot why I cared here
  !!                  
  !! Implementation : Overridable, built upon AXPBY
  !!                  y <- alpha*x + 1*y = axpby(y,alpha,x)
  !!
  !!   Input-Output :
  !! 
  !!          y, intent(inout) : absVector  
  !!      alpha, intent(in   ) : real(dp)
  !!          x, intent(in   ) : absVector 
  !!
  !!##### Example
  !!
  !!         ...
  !!         y = [4._dp, 5._dp, 6._dp] 
  !!         call y % axpy( alpha=3._dp, x=[1._dp, 2._dp, 3._dp] )
  !!         print *, y ! 7, 11, 15
  !!         ...
  !!  
  subroutine axpy_default_dp(y,alpha,x)
    class(absVector_dp), intent(inout) :: y
    real(dp),         intent(in   ) :: alpha
    class(absVector_dp), intent(in   ) :: x
    !
    call y%axpby(alpha,x,1._dp)
    !
  end subroutine axpy_default_dp


  !!
  !!      Procedure : xpy_default_dp(self,x)
  !!
  !!      Operation : y <- x + y
  !!
  !!      Rationale : This procedure can be used to override the (+) operator
  !!                  
  !! Implementation : Overridable, built upon AXPBY
  !!                  y <- 1*x + 1*y = axpby(1._dp,y,1_dp,x)
  !!
  !!   Input-Output :
  !! 
  !!          y, intent(inout) : absVector  
  !!          x, intent(in   ) : absVector 
  !!
  !!##### Example
  !!
  !!         ...
  !!         x = [1._dp, 2._dp, 3._dp]
  !!         y = [4._dp, 5._dp, 6._dp] 
  !!         call y % xpy( x )
  !!         print *, y ! [5._dp, 7._dp, 9._dp]
  !!         ...
  !! 
  subroutine xpy_default_dp(self,x)
    class(absVector_dp), intent(inout) :: self
    class(absVector_dp), intent(in   ) :: x
    !
    call self%axpby(1._dp,x,1._dp)
    !
  end subroutine xpy_default_dp


  !!
  !!      Procedure : xpy_default_dp(self,x)
  !!
  !!      Operation : y <- x - y
  !!
  !!      Rationale : This procedure can be used to override the (-) operator
  !!                  
  !! Implementation : Overridable, built upon AXPBY
  !!                  y <- (-1)*x + 1*y = axpby(1._dp,y,-1_dp,x)
  !!
  !!   Input-Output :
  !! 
  !!          y, intent(inout) : absVector  
  !!          x, intent(in   ) : absVector 
  !!
  !!##### Example
  !!
  !!         ...
  !!         x = [1._dp, 2._dp, 3._dp]
  !!         y = [4._dp, 5._dp, 6._dp] 
  !!         call y % xmy( x )
  !!         print *, y ! [-3._dp, -3._dp, -3._dp]
  !!         ...
  !! 
  subroutine xmy_default_dp(self,x)
    class(absVector_dp), intent(inout) :: self
    class(absVector_dp), intent(in   ) :: x
    !
    call self%axpby(1._dp,x,-1._dp)
    !
  end subroutine xmy_default_dp


  !!
  !!      Procedure : copy_default_dp(to,from)
  !!
  !!      Operation : to <- from 
  !!
  !!      Rationale : This procedure can be used to override the (=) operator
  !!                  
  !! Implementation : Overridable, built upon AXPBY
  !!                  y <- 1*x + 0*y = axpby(0._dp,y,1_dp,x)
  !!
  !!   Input-Output :
  !! 
  !!         to, intent(  out) : absVector  
  !!       from, intent(in   ) : absVector 
  !!
  !!##### Example
  !!
  !!         ...
  !!         x = [1._dp, 2._dp, 3._dp]
  !!         y = [4._dp, 5._dp, 6._dp] 
  !!         call y % copy( x )
  !!         print *, y ! [1._dp, 2._dp, 3._dp]
  !!         ...
  !! 
  subroutine copy_default_dp(to,from)
    class(absVector_dp), intent(  out) :: to
    class(absVector_dp), intent(in   ) :: from
    !
    call to%axpby(1._dp,from,0._dp)
    !
  end subroutine copy_default_dp


  !!
  !!      Procedure : scale_default_dp(self, beta)
  !!
  !!      Operation : self <- beta*self
  !!
  !!      Rationale : This procedure can be used to initialise and to scale
  !!                  
  !! Implementation : Overridable, built upon AXPB
  !!                  self <- 0 + beta*self = axpb(self,beta)
  !!
  !!   Input-Output :
  !! 
  !!       self, intent(inout) : absVector  
  !!
  !!##### Example
  !!
  !!         ...
  !!         x = [1._dp, 2._dp, 3._dp]
  !!         call x % scal( 3._dp )
  !!         print *, x ! [3._dp, 6._dp, 9._dp]
  !!         ...
  !! 
  subroutine scale_default_dp(self, beta) 
    class(absVector_dp), intent(inout) :: self
    real(dp),            intent(in   ) :: beta
    !
    call self%apby(0._dp,beta)
    !
  end subroutine scale_default_dp


  !!
  !!      Procedure : set_scalar_default_dp(self, beta)
  !!
  !!      Operation : self <- beta
  !!
  !!      Rationale : This procedure can be used to initialise and to scale
  !!                  
  !! Implementation : Overridable, built upon AXPB
  !!                  self <- 0*self + beta = axpb(self,0._dp,beta)
  !!
  !!   Input-Output :
  !! 
  !!       self, intent(inout) : absVector  
  !!
  !!##### Example
  !!
  !!         ...
  !!         x = [1._dp, 2._dp, 3._dp]
  !!         call x % setall( 3._dp )
  !!         print *, x ! [3._dp, 3._dp, 3._dp]
  !!         ...
  !! 
  subroutine set_scalar_default_dp(self, beta) 
    class(absVector_dp), intent(inout) :: self
    real(dp),            intent(in   ) :: beta
    !
    call self%apby(0._dp,beta)
    !
  end subroutine set_scalar_default_dp


  !!
  !!      Procedure : norm_default_dp(self) result(res)
  !!
  !!      Operation : res <- || self ||
  !!
  !!      Rationale : This procedure implements the norm of a vector
  !!                  
  !! Implementation : Overridable, built upon DOT
  !!                  self <- sqrt( self^T * self ) = sqrt(dot(self,self))
  !!
  !!   Input-Output :
  !! 
  !!       self, intent(inout) : absVector  
  !!
  !!##### Example
  !!
  !!         ...
  !!         x = [1._dp, 2._dp, 3._dp]
  !!         print *, x % norm(  ) ! sqrt(14)§
  !!         ...
  !!
  function norm_default_dp(self) result(res)
    class(absVector_dp), intent(in   ) :: self
    real(dp)                           :: res
    !
    res = sqrt(self%dot(self))
    !
  end function norm_default_dp

 
  !!
  !!      Procedure : globalizer_default_dp(iloc) result(iglob)
  !!
  !!      Operation : global_index <- local_index
  !!
  !!      Rationale : This routine converts the local index into a global
  !!                  index. It can be useful (for Householder and Givens
  !!                  algorithms) in distributed memory environments. On 
  !!                  shared memory is the identity map.
  !!                  
  !! Implementation : Overridable, identity mapping
  !!                  iglob <- iloc
  !!
  !!   Input-Output :
  !! 
  !!       iloc, intent(in   ) : integer
  !!      iglob, intent(  out) : integer
  !!
  integer function globalizer_default_dp(iloc) result(iglob)
    implicit none
    integer,                 intent(in   ) :: iloc
    !
    iglob=iloc
    !
  end function globalizer_default_dp

  !!
  !!      Procedure : localizer_default_dp(iglob) result(iloc)
  !!
  !!      Operation : local_index <- global_index
  !!
  !!      Rationale : This routine converts the global index into a local
  !!                  index. It can be useful (for Householder and Givens
  !!                  algorithms) in distributed memory environments. On 
  !!                  shared memory is the identity map.
  !!                  
  !! Implementation : Overridable, identity mapping
  !!                  iloc <- iglob
  !!
  !!   Input-Output :
  !! 
  !!      iglob, intent(in   ) : integer
  !!       iloc, intent(  out) : integer
  !!
  integer function localizer_default_dp(iglob) result(iloc)
    implicit none
    integer,                 intent(in   ) :: iglob
    !
    iloc=iglob
    !
  end function localizer_default_dp




  !!
  !!      Procedure : linearCombination_Basis_default_dp(self, coefs, basis)
  !!
  !!      Operation : self <- \sum_{i} basis(i)*coefs(i)
  !!
  !!      Rationale : Procedure implementing linear combinations. 
  !!                  
  !! Implementation : Overridable, built upon DOT
  !!
  !!   Input-Output :
  !! 
  !!       self, intent(inout) : absVector(:)
  !!      coefs, intent(  out) : real(dp)(:,:)
  !!      basis, intent(in   ) : absVector(:)
  !!
  !! 
  subroutine linearCombination_Vector_default_dp(self, coefs, basis)
    class(absVector_dp), intent(inout) :: self
    real(dp),         intent(in   ) :: coefs(:)
    class(absVector_dp), intent(in   ) :: basis(:)
    !
    integer :: i
    !
    if ( basis(1)%size /= self%size ) STOP "Incompatible dimensions"
    call self%setall(0._dp)
    do i = 1, size(basis)
      call self%axpby(coefs(i), basis(i), 1._dp)
    end do
    !
  end subroutine linearCombination_Vector_default_dp


  !!
  !!      Procedure : linearCombination_Basis_default_dp(self, coefs, basis)
  !!
  !!      Operation : self(j) <- \sum_{i} basis(i)*coefs(i,j) for all j
  !!
  !!      Rationale : Procedure implementing linear combinations. Notice that 
  !!                  the coefficients are stored column-wise (i.e. each column
  !!                  contains one set of coefficients for a linear combination)
  !!                  
  !! Implementation : Overridable, built upon DOT
  !!
  !!   Input-Output :
  !! 
  !!       self, intent(inout) : absVector(:)
  !!      coefs, intent(  out) : real(dp)(:,:)
  !!      basis, intent(in   ) : absVector(:)
  !!
  !! 
  subroutine linearCombination_Basis_default_dp(self, coefs, basis)
    class(absVector_dp), intent(inout) :: self(:)
    real(dp),         intent(in   ) :: coefs(:,:)
    class(absVector_dp), intent(in   ) :: basis(:)
    !
    integer :: i, j 
    !
    if ( basis(1)%size /= self(1)%size ) STOP "Incompatible dimensions"
    do j = 1, size(self)
       call self(j)%setall(0._dp)
       do i = 1, size(basis)
          call self(j)%axpby(coefs(i,j), basis(i), 1._dp)
       end do
    end do
    !
  end subroutine linearCombination_Basis_default_dp


  !!
  !!      Procedure : ProjectionOntoBasis_Vector_default_dp(self, coefs, basis)
  !!
  !!      Operation : coefs(i) <- self * basis(i) for all i 
  !!
  !!      Rationale : Project one abstract vector onto a basis of abstract vectors.
  !!                  
  !! Implementation : Overridable, built upon DOT
  !!
  !!   Input-Output :
  !! 
  !!       self, intent(inout) : absVector
  !!      coefs, intent(  out) : real(dp)(:)
  !!      basis, intent(in   ) : absVector(:)
  !!
  !! 
  subroutine ProjectionOntoBasis_Vector_default_dp(self, coefs, basis)
    class(absVector_dp), intent(inout) :: self
    real(dp),         intent(  out) :: coefs(:)
    class(absVector_dp), intent(in   ) :: basis(:)
    !
    integer :: i
    !
    coefs = 0._dp
    if ( basis(1)%size /= self%size ) STOP "Incompatible dimensions"
    do i = 1, size(basis)
      coefs(i) = self%dot( basis(i) )
    end do
    !
  end subroutine ProjectionOntoBasis_Vector_default_dp


  !!
  !!      Procedure : ProjectionOntoBasis_Basis_default_dp(self, coefs, basis)
  !!
  !!      Operation : coefs(i,j) <- self(j) * basis(i) for all i and j
  !!
  !!      Rationale : Project one array of abstract vectors onto a basis of abstract 
  !!                  vectors. Notice that the coefficients are stored column-wise 
  !!                  (i.e. each column contains one set of coefficients for a 
  !!                  linear combination)
  !!                  
  !! Implementation : Overridable, built upon DOT
  !!
  !!   Input-Output :
  !! 
  !!       self, intent(inout) : absVector(:)
  !!      coefs, intent(  out) : real(dp)(:,:)
  !!      basis, intent(in   ) : absVector(:)
  !!
  !! 
  subroutine ProjectionOntoBasis_Basis_default_dp(self, coefs, basis)
    class(absVector_dp), intent(inout) :: self(:)
    real(dp),         intent(  out) :: coefs(:,:)
    class(absVector_dp), intent(in   ) :: basis(:)
    !
    integer :: i, j 
    !
    coefs = 0._dp
    if ( basis(1)%size /= self(1)%size ) STOP "Incompatible dimensions"
    do j = 1, size(self)
       call self(j)%setall(0._dp)
       do i = 1, size(basis)
          coefs(i,j) = self(j)%dot( basis(i) )
       end do
    end do
    !
  end subroutine ProjectionOntoBasis_Basis_default_dp
  !!
  !!      Procedure : Copy_Basis_default_dp(to,from)
  !!
  !!      Operation : to(i) <- from(i) for all i
  !!
  !!      Rationale : Copy one array of abstract vectors onto
  !!                  another array
  !!                  
  !! Implementation : Overridable, built upon COPY
  !!
  !!   Input-Output :
  !! 
  !!         to, intent(  out) : absVector(:)
  !!       from, intent(in   ) : absVector(:)
  !! 
  subroutine Copy_Basis_default_dp(to,from)
    class(absVector_dp), intent(  out) :: to(:)
    class(absVector_dp), intent(in   ) :: from(:)
    !
    integer :: i
    !
    if ( from(1)%size /= to(1)%size ) STOP "Incompatible dimensions"
    do i = 1, size(to)
       call to(i)%copy(from(i) )
    end do
    !
  end subroutine Copy_Basis_default_dp
end module stdlib_AbstractVectors
2 Likes

@kimala thanks for the post. I have a few questions that were not clear to me:

I would recommend adding some motivation to the proposal. Also I would recommend always implementing a prototype and using actual code to figure out the right design.

1 Like

@certik The basic idea is to have some sort of “black box” describing the action of a mathematical operator. This can be useful in several linear algebra libraries, as well as for minimization problems and so on. This mathematical operator can be a function to minimize, a Matrix to invert or another operator to apply (e.g. Laplacian smoothing). If you then have a method that only relies on application of the operator, then you can build the method based on the black box. This allows a lot of flexibility.

In the specific case of iterative methods for linear systems, both the coefficient matrix and the preconditioner can be thought and implemented as black boxes only known for their apply routine. This would allow a potential user to specify whatever preconditioner (and there are some wild ones out there) by extending the abstract class and passing it to the solver. Or to define your coefficient matrix in a “matrix free” way and pass it to the solver.

I hope this clarifies a little bit the motivations.

So to answer you, I would say yes, yes and kindofyes.

  • Yes, it is a proposal for stdlib, so I would like to gather a consensus on naming, procedures and whatever you might think it’s helpful to discuss.
  • Yes. With this kind of object one can build more sophisticated libraries and within these libraries the operator class is only known as a black box.
  • To be honest, I should not propose to standardize the work of someone else. But I am going in the same direction as @loiseaujc and yes the approach taken in LightKrylov is a gem, so I think laying something similar down in stdlib could be useful. I want to stress that this should not be the only approach though, so to not force the user to always extend something. But it will be a good tool to have in the box.

Also, the code I’m presenting here is some n-th iteration and it is actually being tested in multiple procedures. I will try to set up a mwe with some applications of them soon, in the meantime you can check below the implementation fo the PCG with Intrinsic Operators and Abstract Operators/Vectors

(thanks @Beliavsky for the edits, my new markdown editor has no spelling check)

  subroutine stdlib_krylov_pCG_operator_dp( A, K, rhs, x, info, xzero, opts_in )
    !!
    !! In-Out variables
    class(Operator_dp),  intent(in   ) :: A 
    !! Linear operator to be inverted.
    class(Preconditioner_dp),  intent(in   ) :: K
    !! Preconditioner to use.
    real(dp),  intent(in   ) :: rhs(:)
    !! Right-hand side vector.
    real(dp),  intent(inout) :: x(:)
    !! Solution vector.
    integer,   intent(  out) :: info
    !! Information flag.
    logical,   intent(in   ), optional :: xzero
    !! provided initial solution flag  
    type(kr_options),  intent(in   ), optional :: opts_in
    !! Option structure.
    !!
    !! Internal variables
    type(kr_options) :: options
    !! Default options
    real(dp), allocatable :: r(:)
    real(dp), allocatable :: Kr(:)
    real(dp), allocatable :: p(:)
    real(dp), allocatable :: Ap(:)
    !! Residual, search direction and other vectors.
    real(dp) :: alpha, beta
    real(dp) :: p_dot_Ap
    real(dp) :: r_dot_r
    real(dp) :: r_dot_r_old
    !! Scalars used in the CG algorithm.
    real(dp ) :: rhsnorm
    !! Residual and right-hand side norm for exit criterions
    integer :: itnum
    !! Iteration counter
    integer :: infoK
    !integer :: ierr
    !! internal error integer
    integer, parameter :: RCitmax = 5
    !! Residual Correction Iteration number 
    integer :: n, m
    !! Problem size (rows)
    if (present(opts_in)) then
       options = opts_in
    else
       options = default_opts
    end if    
    !! Options to be set

    n = A%get_cdmsize()
    m = A%get_domsize()

    if (options%checkdims) then 

      ! Check if Operator is square
      call stdlib_krylov_is_square(A, info, options%logger)

      ! Check consistency between operator and right-hand side
      call stdlib_krylov_MatVec_consistent(A, rhs, info, options%logger, "A", "rhs")
      
      ! Check consistency between Operator and solution vector
      call stdlib_krylov_MatVec_consistent(A, x, info, options%logger, "A", "x")

      if (info /= 0) return
         
    end if

    ! Compute norm of right-hand side
    rhsnorm = sqrt( dot_product( rhs, rhs ) )
    
    !!
    ! Allocate and initialize support vector
    allocate (Ap, source = rhs)
    Ap = 0._dp

    ! Allocate residual and copy in rhs
    ! ( 2) Evaluate initial residual r_o = rhs ( - A*x_o later)
    allocate (r, source = rhs)

    ! ( 1) Parse whether an initial solution is provided or not
    if ( present(xzero) .and. xzero ) then 
      ! It is provided
      ! ( 1) Keep x_o
      ! ( 2) Evaluate initial residual r_o = rhs - A*x_o
      ! Computes Ap = A * x.
      call A%apply( x , Ap , info)
      r = 1._dp * r + (-1._dp) * Ap 
    else
      ! It is not provided
      ! ( 1) Set x to zero
      x = 0._dp
      ! ( 2) r = b already performed in allocation
    end if
 
    ! ( 3) Allocate preconditioned residual
    allocate (Kr, source = rhs)

    ! "Residual correction" method, see Gambolati-Ferronato. 
    ! Keep RCitnum low (default is 5, which is already high)
    do itnum = 1, RCitmax

      ! (cr1) Apply preconditioner to r
      call K%apply(r, Kr, infoK)

      ! (cr2) Update solution x = x + 1._+dp * Kr.
      x = x + (1._dp) * Kr 

      ! (cr3) Compute A * x (save it on r to help next axpby).
      ! Computes r = A * x.
      call A%apply( x , r , info)

      ! (cr4) Update residual r = rhs - A*x = 1._dp * rhs + (-1._dp) * r.
      r = -1._dp * r + 1._dp * rhs 

      ! (cr5) Compute norm of residual.
      r_dot_r = sqrt(dot_product( r, r ))

      if (associated(options%callback)) call options%callback(itnum, sqrt(r_dot_r), x, .true.)

    end do

    ! ( 3) Apply preconditioner to r to get p0
    call K%apply(r, Kr, infoK)

    ! Allocate direction vector and copy-in p = Kr.
    allocate (p, source = Kr)!, stat=ierr) 
    
    ! ( 4) Compute residual norm
    r_dot_r = dot_product( r, Kr )  
  
    ! Set the info flag to +1, the default non-converged info
    info = 1

    ! ( 5) CG Iteration Loop.
    cg_iterations: do itnum = RCitmax + 1, options%itmax

      ! ( 6) Compute A * p.
      ! Computes Ap = A * p.
      call A%apply( p , Ap , info)

      ! ( 6) Compute p_dot_Ap = (p' * Ap)
      p_dot_Ap = dot_product( p, Ap )

      ! ( 6) Compute step size alpha = (r' * p) / (p' * Ap).
      alpha = r_dot_r / p_dot_Ap

      ! ( 7) Update solution x = x + alpha * p.
      x = x + (alpha) * p 

      ! ( 8) Update residual r = r - alpha * Ap.
      r = r + (-alpha) * Ap 

      ! ( 9) r_k-1 * Kr_k-1 <-- r_k * Kr_k
      r_dot_r_old = r_dot_r

      ! (10) Apply preconditioner to r
      call K%apply(r, Kr, infoK)

      ! (11) Compute residual square length r_k * Kr_k (in the space of K)
      r_dot_r = dot_product( r, Kr )

      ! (12) Check for convergence
      if (associated(options%callback)) call options%callback(itnum, sqrt(r_dot_r), x, .false.)

      if ( sqrt(r_dot_r) < options%tol_dp ) then
        if (associated(options%callback)) call options%callback(itnum, sqrt(r_dot_r), x, .true.)
        print *, "CG: converged", itnum, sqrt(r_dot_r)
           
        ! Set info flag to 0, the default for converged cg
        info = 0
        exit cg_iterations
      end if

      ! (13) Compute new direction beta = - (Kr'* Ap) / (p' * Ap).
      beta = r_dot_r/r_dot_r_old

      ! (14) Update direction p = Kr + beta * p.
      p = beta * p + 1._dp * Kr 


    end do cg_iterations

    deallocate (r, Kr, p, Ap)

    return

  end subroutine stdlib_krylov_pCG_operator_dp
  !!
  !!
  !! Preconditioned Conjugate Gradient Algorithm implementation:
  !!    class(absOperator_dp)-class(absVector_dp)
  !!
  subroutine stdlib_krylov_pCG_abs_dp( A, K, rhs, x, info, xzero, opts_in )
    !!
    !! In-Out variables
    class(absOperator_dp),  intent(in   ) :: A 
    !! Linear operator to be inverted.
    class(absPreconditioner_dp),  intent(in   ) :: K
    !! Preconditioner to use.
    class(absVector_dp),  intent(in   ) :: rhs
    !! Right-hand side vector.
    class(absVector_dp),  intent(inout) :: x
    !! Solution vector.
    integer,   intent(  out) :: info
    !! Information flag.
    logical,   intent(in   ), optional :: xzero
    !! provided initial solution flag  
    type(kr_options),  intent(in   ), optional :: opts_in
    !! Option structure.
    !!
    !! Internal variables
    type(kr_options) :: options
    !! Default options
    class(absVector_dp), allocatable :: r
    class(absVector_dp), allocatable :: Kr
    class(absVector_dp), allocatable :: p
    class(absVector_dp), allocatable :: Ap
    !! Residual, search direction and other vectors.
    real(dp) :: alpha, beta
    real(dp) :: p_dot_Ap
    real(dp) :: r_dot_r
    real(dp) :: r_dot_r_old
    !! Scalars used in the CG algorithm.
    real(dp ) :: rhsnorm
    !! Residual and right-hand side norm for exit criterions
    integer :: itnum
    !! Iteration counter
    integer :: infoK
    !integer :: ierr
    !! internal error integer
    integer, parameter :: RCitmax = 5
    !! Residual Correction Iteration number 
    integer :: n, m
    !! Problem size (rows)
    if (present(opts_in)) then
       options = opts_in
    else
       options = default_opts
    end if    
    !! Options to be set

    n = A%get_cdmsize()
    m = A%get_domsize()

    if (options%checkdims) then 

      ! Check if Operator is square
      call stdlib_krylov_is_square(A, info, options%logger)

      ! Check consistency between operator and right-hand side
      call stdlib_krylov_MatVec_consistent(A, rhs, info, options%logger, "A", "rhs")
      
      ! Check consistency between Operator and solution vector
      call stdlib_krylov_MatVec_consistent(A, x, info, options%logger, "A", "x")

      if (info /= 0) return
         
    end if

    ! Compute norm of right-hand side
    rhsnorm = sqrt( rhs%dot(rhs) )
    
    !!
    ! Allocate and initialize support vector
    allocate (Ap, source = rhs)
    call Ap%setall(0._dp)

    ! Allocate residual and copy in rhs
    ! ( 2) Evaluate initial residual r_o = rhs ( - A*x_o later)
    allocate (r, source = rhs)

    ! ( 1) Parse whether an initial solution is provided or not
    if ( present(xzero) .and. xzero ) then 
      ! It is provided
      ! ( 1) Keep x_o
      ! ( 2) Evaluate initial residual r_o = rhs - A*x_o
      ! Computes Ap = A * x.
      call A%apply( x , Ap , info)
      call r%axpby( (-1._dp), Ap, 1._dp)
    else
      ! It is not provided
      ! ( 1) Set x to zero
      call x%setall(0._dp)
      ! ( 2) r = b already performed in allocation
    end if
 
    ! ( 3) Allocate preconditioned residual
    allocate (Kr, source = rhs)

    ! "Residual correction" method, see Gambolati-Ferronato. 
    ! Keep RCitnum low (default is 5, which is already high)
    do itnum = 1, RCitmax

      ! (cr1) Apply preconditioner to r
      call K%apply(r, Kr, infoK)

      ! (cr2) Update solution x = x + 1._+dp * Kr.
      call x%axpby( 1._dp, Kr, 1._dp)

      ! (cr3) Compute A * x (save it on r to help next axpby).
      ! Computes r = A * x.
      call A%apply( x , r , info)

      ! (cr4) Update residual r = rhs - A*x = 1._dp * rhs + (-1._dp) * r.
      call r%axpby( 1._dp, rhs, -1._dp)

      ! (cr5) Compute norm of residual.
      r_dot_r = sqrt(r%dot(r))

      if (associated(options%callback)) call options%callback(itnum, sqrt(r_dot_r), x, .true.)

    end do

    ! ( 3) Apply preconditioner to r to get p0
    call K%apply(r, Kr, infoK)

    ! Allocate direction vector and copy-in p = Kr.
    allocate (p, source = Kr)!, stat=ierr) 
    
    ! ( 4) Compute residual norm
    r_dot_r = r%dot(Kr)  
  
    ! Set the info flag to +1, the default non-converged info
    info = 1

    ! ( 5) CG Iteration Loop.
    cg_iterations: do itnum = RCitmax + 1, options%itmax

      ! ( 6) Compute A * p.
      ! Computes Ap = A * p.
      call A%apply( p , Ap , info)

      ! ( 6) Compute p_dot_Ap = (p' * Ap)
      p_dot_Ap = p%dot(Ap)

      ! ( 6) Compute step size alpha = (r' * p) / (p' * Ap).
      alpha = r_dot_r / p_dot_Ap

      ! ( 7) Update solution x = x + alpha * p.
      call x%axpby( alpha, p, 1._dp)

      ! ( 8) Update residual r = r - alpha * Ap.
      call r%axpby( -alpha, Ap, 1._dp)

      ! ( 9) r_k-1 * Kr_k-1 <-- r_k * Kr_k
      r_dot_r_old = r_dot_r

      ! (10) Apply preconditioner to r
      call K%apply(r, Kr, infoK)

      ! (11) Compute residual square length r_k * Kr_k (in the space of K)
      r_dot_r = r%dot(Kr)

      ! (12) Check for convergence
      if (associated(options%callback)) call options%callback(itnum, sqrt(r_dot_r), x, .false.)

      if ( sqrt(r_dot_r) < options%tol_dp ) then
        if (associated(options%callback)) call options%callback(itnum, sqrt(r_dot_r), x, .true.)
        print *, "CG: converged", itnum, sqrt(r_dot_r)
           
        ! Set info flag to 0, the default for converged cg
        info = 0
        exit cg_iterations
      end if

      ! (13) Compute new direction beta = - (Kr'* Ap) / (p' * Ap).
      beta = r_dot_r/r_dot_r_old

      ! (14) Update direction p = Kr + beta * p.
      call p%axpby( 1._dp, Kr, beta)


    end do cg_iterations

    deallocate (r, Kr, p, Ap)

    return

  end subroutine stdlib_krylov_pCG_abs_dp

This would be an excellent tool and I would argue that you’re not forcing anyone to always extend something. IMO, you’re laying out a clear abstract API that, if concretized, can leverage a suite of centrally maintained Krylov solvers.

For those that don’t want to use type extension of a base class, Arpack is still an option.

What would be the other paths one could go in modern fortran to accomplish what you’re doing ?

1 Like

Besides type extension, the methods available for injecting behavior are:

  • Procedure arguments (as in SLATEC SLAP, e.g. dcg.f)
  • Procedure pointers (packed in a derived type for tidyness)
  • Reverse communication (just like ARPACK, the venerable templates, or oneMKL RCI ISS)

By means of internal procedures, it’s always possible to convert an interface using one approach into a different one. Internal procedures have some weakness (indirection, executable stack) as explained here: Is creating nested subroutines/functions considered good practice in Fortran? - #25 by szakharin. For realistic problems I’d expect the linear operator apply method to dominate the runtime.

Since there’s already a desire to abstract a linear operator, maybe it makes sense to take it a step further and abstract the vector elements the operator works on too? A generic apply can be provided with a simple interface using arrays for basic cases (which are likely the most common), and an abstract vector type-based interface for more complex scenarios. This way, we keep it simple for everyday use but still have the flexibility to handle fancier stuff when needed.

You are perfectly right: the implementation of the abstract vectors elements (absVector_dp) is proposed in the code block under the paragraph Abstract Vectors of my first post. It only works in conjunction with Abstract Operators (absOperator_dp).

This is also the spot where you can, and want to, introduce optimizations, which IMHO advocates for simplicity of interfacing with the libraries, but I have to admit that the monkey in my brain plays the cymbals when I try to understand reverse communication.

When extending a derived data type (if I’m not mistaken) you can introduce additional dependencies. This means that I could extend my base absOperator_dp and add the module mpi_f08 and create an mpiOperator_dp. In the same way I could extend Operator_dp with openMP, or openACC and so on. Moreover, these extension are standalone from the library and thus easily testable outside. Could reverse communication be as simple/user friendly as this?

1 Like

IMO it is just a matter of choosing your own poison. Some people avoid type extension and deferred bindings, others don’t like reverse communication. I think @victotronics has described reverse communication fairly when he says:

The idea is to make the function data-independent by keeping
the data entirely outside its scope. Thus for every operation on that data
you have to return, with a request parameter indicating what
operation has to be performed outside, and you have to have a way
to indicate where you resume after the operation is done.

I’ve written code like this, and it’s kinda neat, but I’m not
entirely unconvinced that it’s not totally evil. Or not.
[emphasis added]

Since LIghtKrylov has been mentioned by @certik and served as inspiration for @kimala, let me add my two cents to this discussion.

Adding an abstract class for linear operators (and vectors) to stdlib is something I have suggested a couple of times already, although I was too busy working on LightKrylov to be a full-time advocate for it. I truly believe that it would be beneficial for the scientific Fortran environment as a whole to have such abstract types provided by stdlib which users can extend for whatever they want. If user-defined packages (say for linear algebra for instance since this is the question here) expose the same API, it might actually give rise to a nice inter-operability between possibly very different, albeit complementary, packages. Such linear algebra packages, extending some well-defined abstract types and conforming to some established API, could then be inter-changed easily when developing a convex optimization library (which is something I’m planning to do shortly). Having programmed quite a bit in Julia a few years ago, this is something I’ve enjoyed quite a lot.

Regarding the second point about LightKrylov: I’ve started working on LightKrylov because the kind of computations I wanted to do (solving very large-scale Lyapunov and Riccati equations) couldn’t be done otherwise. After having played around a bit in Julia, going for the abstract route seemed fairly natural and allowed me to plug LightKrylov to several CFD codes we use in the lab extremely easily, although each relies on very different numerical methods or architectures (from spectral elements, to finite differences, incompressible Navier-Stokes to high Mach number simulations, etc). Yet, after finding about stdlib, I’ve always had in mind that I’d be more than happy to transfer any bit of LightKrylov that would be useful to a larger community.

These are all valid ways to proceed. As far as I’m concerned, I think that, to some extent, going for the abstract route makes the code look more like typical pseudo-code you’d find in linear algebra papers. I also find it way easier on the developer side, although you have to have in mind how to make it easy for end-users to extend the types for their applications. But once more, that maths is what guided me for LightKrylov. Except for a couple of additional type-bound procedures (e.g. how to create a random vector), all the other type-bound procedures either emulate some BLAS-1 operations or what actually define mathematically an element of a vector space.

5 Likes

I may be “fancy”. I rarely use rank 1 arrays to store solution variables.

There are some things to beware with this approach to low-level linear algebra operations. First, many operations involve only a single read and/or single write of each vector element, combined with maybe one or two floating point operations. Those can be done efficiently in a normal do loop (perhaps with strip mining and unrolling for efficiency) with all intermediate quantities stored in registers, not vectors. But with the above approach, there is an expensive allocate() statement, the effort for which can be many times larger than the simple loop when the allocation is from a heap. Even if the temporary array is automatic, with allocation typically on the stack, it is still an expensive operation, plus this introduces the possibility of stack overflow errors at run time. Many important operations are actually sparse, and only require access to a small number of elements in the vector. One example is a plane rotation, which involves access to only a pair of elements within the vector. There are also examples involving higher-level matrix operations that require access to only O(N) elements out of the total N*N elements in the matrix. A rotation between two rows or columns of a matrix is an example of this. Next is the call to the BLAS-like subroutines scal() and axpy(). These both have what look like increments in them, which suggests that the dummy arguments are contiguous assumed-size arrays. This argument association can invoke copy-in/copy-out operations if the actual assumed shape arguments are not contiguous, further increasing the computational effort beyond what a simple loop could achieve.

There are some simple answers to some of these issues, but not to all of them. This is why, for example, the LAPACK library uses relatively few modern fortran features and still looks like 40 year old f77 code. The authors and maintainers have never agreed on how such modern features can be incorporated into this legacy library.

1 Like

Neither do I. I have worked quite a lot with the spectral element solver Nek5000, implementing large-scale eigenvalues/svd solvers to study the spectral properties of the linearized Navier-Stokes operator. In this case, my extension of abstract_vectors typically look like that

type nek_vector, extends(abstract_vector)
    ! Velocity arrays.
    real(dp) :: vx(nx, ny, nz, ne), vy(nx, ny, nz, ne), vz(nx, ny, nz, ne)
    real(dp) :: pr(nx2, ny2, nz2, ne)
    real(dp) :: theta(nx, ny, nz, ne, nps)
contains
    ! All the type-bound procedures.
end type

where the first three indices correspond to the indices of Gauss-Lobatto-Legendre points in a given element, the fourth one to the spectral element index, and nps loops over the different additional scalar fields I may have in my simulation (e.g. temperature, concentration, passive tracer, etc).

1 Like

Even though I’m a big fan of using a relatively high abstraction level when implementing generic math procedures (whether it is linear algebra, convex optimization, or others), I do need to point a couple of things that one needs to keep in mind when in comes to performances and code development:

Using an abstract vector type might require the developer to partially re-invent the wheel in some cases just to accommodate the fact an abstract route has been taken. As an example, suppose you have an array of such abstract vectors and you would like to orthogonalize them. Since, on the abstract package side, you have no idea what kind of data structure a user might use to represent their notion of a vector, you are forced to implement linear algebra algorithms using only the basic operations defining a vector. For the case of orthonormalizing a set of abstract vectors, it implies you have to implement an abstract version of the modified Gram-Schmidt process for instance. If you want to make sure your basis has full rank, it is the pivoted QR decomposition that you need to re-implement despite the latter being available in lapack had you use standard rank-2 arrays, etc. It also precludes de-facto some algorithms which, for efficiency (and as @RonShepard mentionned), make explicit use of the rank-1 or rank-2 representations. For the task of orthogonalization for instance, I don’t see any way to use Householder reflections with abstract vectors although they are known to be more numerically stable than MGS.

Another thing is that all the actual computational efficiency has to come from the user. They need to make sure their matrix-vector product is implemented as efficiently as possible if they want good performances. While it drastically lighten the burden on the developer’s shoulders, it also comes at the price that even users cannot leverage all the possible performance tricks offered by simpler data structure. A simple example is the matrix-matrix product Y = A X, where A is an abstract linear operator, and X and Y are arrays of properly defined abstract vectors. Were you to use built-in rank-2 arrays for each of these matrices, you could then leverage BLAS-3 operations, yet, since X and Y are arrays of abstract vectors, it is most likely that you’d have to implement this matrix-matrix product as an iteration over matrix-vector products and, in the best case, only leverage BLAS-2 operations.

I do believe though that these form a relatively small price to pay (if you are well-aware about it) for a massively increased flexibility. One of the major benefit I’ve seen from developing/working with LightKrylov for instance is that the abstract code needs to know absolutely nothing from the mpi stuff. Anything mpi-related comes from the user’side and is contained in the definition of the actual non-abstract matrix-vector product and inner product. This enabled us to use LightKrylov with codes as simple as computing the time-dependent controlability Gramian of the 1D Ginzburg-Landau equation on my laptop all the way to computing the leading eigenvalues and eigenvectors of the linearized Navier-Stokes operator for a fully three-dimensional flow using thousands of processors. Nothing had to be changed in LightKrylov for either case, only the actual type extensions in each of the concerned solvers.

1 Like

I agree with you and this point has kept me up at night for long. However, I think this can be solved with good documentation. The key point is that the user is not supposed to use the default implementations. The default implementations are overridable, and the user shall override them all. However, asking the user to implement too many deferred procedures is a bad choice in my opinion (the code will complain that the type will be abstract until everything is coded) so I would rather have a basic number of deferred procedures to start with and start relatively soon to test my implementation. Take also into account that at this point, if the user needs abstract operators and abstract vectors, they are pretty far down the rabbit hole, so it makes sense to ask them to implement everything, but also it makes sense to let the user build their abstract sets incrementally replacing the default procedures.

One option could be to make the default routines verbose, logging out a statement that you need to override them. Another option is to have a checklist of the things you need to implement/override before going to production codes. (I would advocate for both)