FSPARSE: A modern fortran library for sparse matrices - Call for contributors

Dear all,

I would like introduce FSPARSE, which is a library I started to put together for facilitating handling sparse matrices with a minimal OOP touch for data-containment in a pure Fortran environment.

While the project is still very much a work in progress it already contains:

  • Types for COO/CRS/CSC/ELLPACK formats
  • Several matvec kernels, including for some symmetric representations
  • Transformation between dense<>COO ; COO<>CSR

Before pushing any further I decided to announce it in order to get some early feedback on the architecture, and see if there are some interested contributors that would like to join and help get this in shape for the rest of the community :slight_smile:

The idea(s) I’m striving for:

  • It shall be a modern approach such that any new comer to the topic can come here to learn (in a Fortranic manner) how to manipulate sparse matrices. Meaning: first use it, then learn how it works. (à la scipy.sparse from where I got a lot of inspiration)
  • It shall contain kernels that are performant enough such that it can be used as is, yet easy enough that it can be readable and understandable … Not always easy but well, one can always try :slight_smile: … extendable such that one can chose to use any of the already available libraries out there to replace/bypass the kernels ( Eigen, PSBLAS, PETSc, etc ). So, to use the library as a data-descriptor and let other libs do the heavy-lifting.

This idea is being discussed on the stdlib side as well, I just felt such a project would benefit from some “freedom” for testing ideas and once a higher maturity is reached, make the migration to stdlib.

There are several TODOs, from which I thought about a few that would be very helpful:

  • Discuss the OOP design (not the fact of using OOP please, but the chosen architecture and how to minimize pitfalls and maximize user-friendliness maintaining performance … the current design comes from a working implementation I already use for HPC code and that has proven quite versatile but I’m looking for challenging it)

Ex: as you will see in the readme, I decided to create several types per matrix like:

|Matrix type |no data|float     |double    |
|COO         |`COO_t`|`COOr32_t`|`COOr64_t`|

Why a ‘no data’? some times one just wants to do ‘graph’ or topological transformations without needing to worry about attributes, if so, this “no data” offers a light-weight container with only the connectivity. Does it look reasonable? Would you have done differently?

  • Include in the lib a modern version of the MatrixMarket and Harwell-Boeing files formats: This would then enable using for instance the http-client for easy benchmarking, and unit-testing.
  • Suggestions for the unit tests.
  • Complete the conversion between types. Yet to implement for instance COO<>ELL.
  • Add basic (modernized) renumbering algorithms (CM / RCM / GPS)
  • Other ideas?

As mentioned in the title, this is a call for contributors, so you are very much welcome :slight_smile:

Looking forward to your comments and contributions

19 Likes

Hi @hkvzjal,
I am very much interested in this project.
Only thing that was holding me back is - sparse matrix algorithms are mostly “Graph” data structure based. Therefore C and C++ shine a lot in this area. But I am sceptical how well such data structures can be developed in Fortran which are efficient (computationally) as C counterparts.
For some time I was just thinking to develop something in C with an exclusive intent of making it completely useable from Fortran. These are just my thoughts.
Anyway let us discuss how to take FSPARSE further.

1 Like

Hi @Ashok thank you for your interest in the project! looking forward to discuss about it!!

My opinion is that these methods shine so much in C/C++ not because of the fact of handling graphs, for which I have found no limitations in Fortran as far as the use cases I have faced, but mainly because of the shear size of the user community of C/C++ which dwarfs Fortran’s. One can already find many facilities built say, in the Boost library and many others.

Yet, when one starts looking at the original papers/books… many of the foundational algorithms where written in FORTRAN77, then translated in C/C++. Take the bible of sparse linear systems by Yousef Saad https://www-users.cse.umn.edu/~saad/IterMethBook_2ndEd.pdf and the accompanying source codes in SPARSEKIT2 The algorithms are written in FORTRAN in the book.

The references are countless, but just to say that there is no intrinsic limitation to the language, and all the facilities of Modern Fortran make it just as fit as any to handle most of the complex use-cases easily enough. I think it is mainly related to a historical reasons… Yet, the team of PSBLAS that I cite above, coded most of their package in Fortran, quote from their page:

The PSBLAS library version 3 is implemented in the Fortran 2003 programming language, with reuse and/or adaptation of existing Fortran 77 and Fortran 95 software, plus a handful of C routines

So, I think that it is feasible to build a general purpose sparse kit with Modern Fortran in sight as an entry-level library. @certik started a proposal here https://github.com/certik/stdlib/blob/csr/src/stdlib_experimental_sparse.f90 for the low-level non-OOP kernels. I took the inverse route having in mind the end user to drive the design. Both approaches are necessary I think.

Let me know which use-cases do you have in mind and lets try to get it working :slight_smile:

4 Likes

Hi @hkvzjal,
Thanks for your insightful reply. Maybe I was carried away by community strength of C/C++ and never tried in Fortran. Let us start implementing in Fortran then.
My use case if for Finite Element Analysis of structural analysis problems (Static, transient, buckling) problems. I wanted to write a self sufficient Modern Fortran library for Non-linear FEM - where code clarity is given higher preference than functionality. Target is especially grad students who can get more clarity in the implementation of these methods and then can extend the functionality. That’s my goal - of course a big one- but let’s see how far I will go.

2 Likes

Dear @hkvzjal , hi.
Let me first tell you how grateful I am that someone is implementing a sparse kit for fortran, as someone that did part of his research in numerical linear algebra having to implement from scratch CSR notations and operations.

Anyway, I’m taking this opportunity to propose a new idea to the whole community: In every project you work on you make particular choices in the design, you prefer some paradigm rather than others. As a newbie, while I understand what is going on in the code I struggle to see the big picture and why one choice is preferable rather than others. I think that dedicating part of the discussion to explain the design choices (up to a certain degree of accuracy of course, I don’t expect you to go line by line :sweat_smile: ) is going to be an important feature for this forum and this community. I had the same thoughts a couple of days ago seeing the post of @dubos in another thread, and I wanted to start asking several questions about his GFD code structure (perhaps I will do it at the next PDEonthesphere :wink: )

Thank you in advance for sharing any information about your project

1 Like

Hi,

It would be great to incorporate the work that has already been done for sparse matrices in Fortran in the fantastic SPARSEKIT package by Yousef Saad:
https://www-users.cse.umn.edu/~saad/software/SPARSKIT/

We have utilized some code from there in our production codes, including the ILU0 decomposition and forward-backward solvers, as well as the DIA format and conversions (dia2csr etc.).

Assuming licensing/code-sharing issues are OK, I think it would be great to use as a reference for a lot of the formats and algorithms, which should speed up implementation time for the new package.

– Ron

1 Like

Hi @kimala, thanks for the kind words!! this is the same feeling I got when I discovered the Fortran community, fpm, and so many more projects being built around it!!

This is a totally valid point! I’ll try my best to explain the idea. Just for the anecdote, I would say this is my third iteration on the design pattern for this project that I had in mind since several years ago.

The first point would be that having an OOP API for sparse matrices is very handy when one wants to: recycle the graphs, be it for building the sparse matrices for linear solvers, or for topological operations on meshes.

Then, why the particular approach I took? I’ll use the COO definition to explain the idea:
(see the file fsparse_matrix_gallery.f90)

    type, abstract :: sparse_t
      integer :: base = 1 !! index base = 0 for (C) or 1 (Fortran)
      integer :: nrows= 0 !! number of rows
      integer :: ncols= 0 !! number of columns
      integer :: NNZ  = 0 !! number of non-zero values
      integer :: sym  = k_NOSYMMETRY !! assumed storage symmetry
    end type
  
    !! COO: COOrdinates compresed format
    type, extends(sparse_t) :: COO_t
      logical               :: isOrdered = .false. !! wether the matrix is ordered or not
      integer, allocatable  :: index(:,:) !! Matrix coordinates index(2,nnz)
    contains
      procedure :: malloc => malloc_coo
    end type
  
    type, extends(COO_t) :: COOr32_t
      real(sp), allocatable :: data(:) !! single precision values
    end type
  
    type, extends(COO_t) :: COOr64_t
      real(dp), allocatable :: data(:) !! double precision values
    end type

The idea of the basis abstract type type, abstract :: sparse_t is to simply have a common denominator on the attributes a sparse matrix should have, yet not allowing to instantiate this class, as it would make no sense to use it as is

The derived COO_t class, would already contain 2 members, the indexes or coordinates of the matrix and a simple boolean to identify whether the indexes have been ordered or not: This type already enables doing graph operations without having to worry about the data attribute. Also, one can use to “throw in” connectivity’s from say a mesh, and order it afterwards. I have found that this is much more efficient than trying to built an ordered matrix every time one inserts a node-pair.

Having the COOr32_t COOr64_t enables having also the data attributes which for FEM would be the actual content of the matrix, or in the GNN world, the attributes of the graph… Those would have been compacted with PDTs but since this feature of Fortran is kind of half-baked I stopped pursuing its use (would have loved to use it more though)

1 Like

That statement might have been true for f77 due to its lack of pointers and dynamic memory allocation, but modern fortran has all the necessary features to implement graphs efficiently and clearly. In fact, I would say that allocatable arrays give modern fortran a unique feature that can be used to advantage for network structures.

1 Like

Hi @sumseq, the SPARSEKIT and SPARSEKIT2 packages are reference one should always have when doing sparse algebra. I avoided using directly the implementations therein for 2 reasons: License and norm of the language used. This is more of a personal taste, but I would like to see modernized versions of the algorithms (use of Fortran norms > 2003), not only interfaces to the FORTRAN77 ones. I don’t know if it will be possible but one can always dream :slight_smile: and well, maybe with some help from the community it would be possible.

This can be included quite easily in the current architecture!

Exactly, the combo allocate / move_alloc enable some real magic that I’m very fond of!!

Hi,

I guess what I was suggesting is a conversion of a lot of SPRASEKIT into your new OOP modern Fortran package. I suppose the licencing may be an issue there, but it would still be a great guide as to what formats and algorithms to re-implement.

– Ron

1 Like

@sumseq besides the DIA format, could you mention a couple of addittional features you would deem important? Say a “must have” and a “nice to have” in such API?

I would say the standard formats such as DIA and ELL.
Also, the triangular solvers LSOL and USOL for all formats.
The ILU algorithms are really important too, with ILU0, MILU0, RILU0.
Also see this “bug fixed” version of the package:

– Ron