Optimal way to code getting spectra of matrices?

I have thoughts below, but my overall questions are:
What is the easiest way to get eigenvectors/eigenvalues of a generic (diagonalizable) matrix in Fortran for the purposes of prototyping? Additionally, for performant code in large codebases that have a large need for this functionality, what is the best way (with good programming practices) to do this?

I recently found, from prototyping an algorithm in Fortran, something to dislike, compared to say Matlab or Python:
The difficulty in finding eigenvalues and eigenvectors of dense matrices. I’ll share my thoughts below, and pose a question.

Context: I was prepping to give a Fortran tutorial (a second part) given to a particular community. Its focus was on scientific computing. An extended example I had in mind included an eigenvalue/eigenvector solve in the algorithm. An online search seemed to tell me LAPACK was the easiest way to go for this. Downloading and compiling LAPACK is fine, linking them a little annoying but fine, but actually using them is quite annoying compared to Es, Vs = np.linalg.eig(A). I found (in my limited time) there was no good way to introduce this functionality to the people I was giving the tutorial for, so that they could meaningfully follow along themselves during the active tutorial.

For prototyping
The annoyance/difficulty in calling LAPACK routines is a big hamper to my desire to prototype algorithms in Fortran, or teach people anything involving spectra of matrices with Fortran. Fortran’s in-built dot_product and matmul routines are great; it would be great to have an in-built eigen() routine, or an easily importable library with that functionality. It doesn’t have to even be nearly as performant as LAPACK, but it would make prototyping a lot quicker. As it is, eigenvalue calculations are an extremely common part of science, and if I put on my ‘instructor hat’, I would never have students use Fortran to do this when Matlab or Python can do it much more easily. It’s kind of sad when, for eigenvalue/eigenvectors, Matlab/Python really can do “Formula Translation” unlike Fortran, especially when they’re usually calling LAPACK themselves!
For performance
To have speed, using LAPACK or equivalent is inevitable. Using Google, it seemed for dense matrices it was best to use LAPACK. For optimal performance, one should perform the workspace query for LAPACK routines, and re-use that information. The workspace also means carrying around an extra two vectors for SSYEV (or more depending on the routine, like SSYEVR), plus several extra scalars. For performance, this is necessary but unwieldy.
For maintenance and readability of large codebases
Many of Fortran’s unique features as a language are useful for increasing readability of large codebases. Declarations of intent(in/out), separation of functions and subroutines, existence of the ‘pure’ and ‘elemental’ keywords, ability to name constructs and exit nested ones at any level of nesting, etc. It almost seems like Fortran is built for HPC with large codebases.
This is amazing, as large codebases need a lot of help in code readability for future modification and maintenance. LAPACK routines, however, are an obscure language of their own. For readability, I’d need to wrap the Fortran routines I need in a nicer wrapper that users/future programmers don’t need to worry about. For performance, however, I need to keep all the extra workspace-related baggage that LAPACK has. I’d wrap the variables into a derived type, probably, and instantiate different objects if I’m using LAPACK routines on different sized matrices, but re-using the workspace variables when I can. The names would be chosen for readability. However, writing my own wrappers and types every time seems silly. At least high-performant libraries like FFTW have you create a ‘plan’ containing the equivalent information for this. Creating a ‘compute plan’ is also now a large part of the CUDA framework, including recent features like CUDA Graphs. Perhaps this could be explored.

I’m not sure what the Fortran community can do about LAPACK in particular, but if people are thinking of adding eigenvalue/eigenvector solvers to the Fortran Package Manager (or standard library), or at least making modern wrappers to LAPACK easily available in fpm, the above may be something to think about.

Rant over!
Again, my questions are about: the easiest way to get eigenvalues/eigenvectors for prototyping in Fortran (is it really LAPACK?), and the best way to do it for large codebases that need good and robust performance.

6 Likes

Two resources that I can remember belong to John Burkardt’s codebase:

  1. Eigenvalues and Eigenvectors of a Symmetric Matrix
  2. ARPACK

Numerical recipes in Fortran by Press et al. also have simple implementations of eigensolvers. I agree that using LAPACK for simple daily usages or even high-level libraries is a bit overkill.

The fortran-utils package from @certik has a straightforward wrapper for calculating eigenvalues of general square matrices:

subroutine eig(A,lam,c)
  real(sp), intent(in) :: A(:,:)     ! Matrix
  complex(sp), intent(out) :: lam(:)  ! Eigenvalues A c = lam c
  complex(sp), intent(out) :: c(:,:) ! Eigenvectors
end subroutine

There are several open issues about including linear algebra subprograms in stdlib where help is needed, e.g.:

There are also some recent papers addressing this problem at a higher/lower level depending on how you look at it:

The second paper introduces the tool Linnea (essentially a domain-specific compiler) where given an algebraic input expression, the compiler generates the necessary LAPACK calls. Currently only Julia is supported as the output language.

Quoting the words of Salvatore Fillipone (author of PSBLAS):

Your code can

  • Be quick to write and compact
  • Be very fast
  • Require the minimum possible amount of memory

but you only get to pick two…

5 Likes

This is precisely the motivation behind MATLAB, quoting from “The History of MATLAB”:

In the 1970s and early 1980s, while he was working on the LINPACK and EISPACK projects, Moler was a Professor of Mathematics and then of Computer Science at the University of New Mexico in Albuquerque. He was teaching courses in linear algebra and numerical analysis. He wanted his students to have easy access to LINPACK and EISPACK functions without writing Fortran programs. By “easy access” he meant avoiding the remote batch processing and the repeated edit-compile-link-load-execute process that were ordinarily required on the campus central mainframe computer. This meant the program had to be an interactive interpreter, operating in the time-sharing systems that were becoming available.

Also quoting from the GNU Octave pages:

Octave was originally conceived (in about 1988) to be companion software for an undergraduate-level textbook on chemical reactor design being written by James B. Rawlings of the University of Wisconsin-Madison and John G. Ekerdt of the University of Texas. We originally envisioned some very specialized tools for the solution of chemical reactor design problems. Later, after seeing the limitations of that approach, we opted to attempt to build a much more flexible tool.

There were still some people who said that we should just be using Fortran instead, because it is the computer language of engineering, but every time we had tried that, the students spent far too much time trying to figure out why their Fortran code failed and not enough time learning about chemical engineering. We believed that with an interactive environment like Octave, most students would be able to pick up the basics quickly, and begin using it confidently in just a few hours.

The troubles in using LAPACK (or linear algebra subprograms in general) are also not limited to Fortran, but affect also other HPC languages like C/C++ as I’ve written previously. There has been some work in C++ to try and get linear algebra into the language/standard library:

There are also new libraries for dense linear algebra like libflame, magma, and plasma all of which only offer “expert” interfaces as far as I’m aware. I haven’t looked at them closely enough to know whether they use compute “plans” or mostly follow old LAPACK conventions.

Some other threads at this discourse (one and two) discuss reasons why linear algebra libraries appear to be moving away from Fortran.

1 Like

@adenchfi what you wrote is precisely why I decided to get involved in Fortran development itself. I come from the same position as you just described. The situation is simply unacceptable to me, for the reasons you said.

Here is what I want:

  • I want to use Fortran like you can use Matlab or Python: that is, interactively and have beautiful simple (but performing!) functions to deal with linear algebra among other things.

  • To compute eigenvalues should be done by lam = eigvals(A), to compute eigenvalues and eigenvectors should be call eig(A, lam, c) from an interactive environment

So to fix that:

  • I have started the LFortran project that is interactive, but also can be used as just a regular compiler. It works in Jupyter and right now we require to declare all types. But we can relax it and even infer types from the right hand side, like in Python. I don’t know yet if this is a good design, so it is not done by default yet. But I am mentioning it to show that technically we have the means to make Fortran as easy to use as Python for prototyping.

  • For simple interfaces to linear algebra, as @ivanpribec mentioned, we have that in fortranutils, and I am hoping to get the community to agree and to help get it into stdlib, with an agreed upon interface.

  • Fortran Package Manager (fpm): the last missing piece of the puzzle is fpm. We need it to be super simple to install these 3rd party libraries (including in the interactive LFortran prompt!), so that when you get a newcomer, you can get them up and running in seconds (and robustly on all platforms!), just like in Python.

So this will happen. If you are interested to help us out to get there faster, please do, we would really appreciate it.

Here is my offer to anyone reading this: I am happy to do a video call to discuss how you can most effectively help, and get you up to speed to contribute to our efforts the most effectively. Just get in touch with me, privately or publicly.

4 Likes

Lapack95 is a Fortran95 interface for LAPACK. To compute eigenvalues of a symetrical matrix call la_syev(A,w), for example. Although it would be nice to update the interfaces with the most modern Fortran features…

3 Likes

@pjs thanks for the link and welcome to the Forum! Indeed, the idea is not new.

1 Like

A convenient way to compile LAPACK is to use John Burkardt’s single-file free source form double precision version here with driver code here. An executable program is created with gfortran -fallow-argument-mismatch lapack_d.f90 lapack_d_test.f90. The code (listed below) that calls LAPACK to compute eigenvalues and eigenvectors is indeed overly complicated compared to Matlab or SciPy, although it has been pointed out that the Lapack95 interface is simpler.

subroutine dsyev_test ( )
!! DSYEV_TEST tests DSYEV.
!  Licensing:
!    This code is distributed under the GNU LGPL license.
!  Modified:
!    12 September 2006
!  Author:
!    John Burkardt
  implicit none
  integer ( kind = 4 ), parameter :: n = 7
  integer ( kind = 4 ), parameter :: lwork = 3 * n - 1
  real ( kind = 8 ) a(n,n)
  integer ( kind = 4 ) info
  character jobz
  real ( kind = 8 ) lambda(n)
  character uplo
  real ( kind = 8 ) work(lwork)
  write ( *, '(a)' ) ' '
  write ( *, '(a)' ) 'DSYEV_TEST'
  write ( *, '(a)' ) '  DSYEV computes eigenvalues and eigenvectors'
  write ( *, '(a)' ) '  For a double precision real matrix (D)'
  write ( *, '(a)' ) '  in symmetric storage mode (SY).'
  write ( *, '(a)' ) ' '
!  Set A.
  call clement2 ( n, a )
  call r8mat_print ( n, n, a, '  The matrix A:' )
!  Compute the eigenvalues and eigenvectors.
  jobz = 'V'
  uplo = 'U'
  call dsyev ( jobz, uplo, n, a, n, lambda, work, lwork, info )
  if ( info /= 0 ) then
    write ( *, '(a)' ) ' '
    write ( *, '(a,i8)' ) '  DSYEV returned nonzero INFO = ', info
  else
    call r8vec_print ( n, lambda, '  The eigenvalues:' )
    if ( jobz == 'V' ) then
      call r8mat_print ( n, n, a, '  The eigenvector matrix:' )
    end if
  end if
  return
end

Thanks for the replies everyone!

Thanks @ivanpribec for the links to other discussions, and historical findings. The MATLAB and Octave ecosystems make sense as a friendlier approach for prototyping linear algebra programs. I like the quote from Fillipone, it seems quite accurate. For prototyping, you might only need the first (Quick to write and compact). For prototyping in Fortran, we could have at least the first two. Large codebases need the last two points for sure, but that isn’t what students need, and it seems to me that students determine the adoption of the language.

I actually did not know about the fortran-utils package by @certik which seems very interesting. I will have to explore it; it seems easy enough to copy over, its usage for computing eigenvalues/eigenvectors seems excellently easy, and I suppose it isn’t too much to ask students to git clone the repository for the purposes of a tutorial. Compilation doesn’t seem that bad either. I see there’s great work going on in this direction!

@certik I do think being able to prototype things involving eigenvalues/eigenvectors is important, but I personally am okay with a non-interactive method of prototyping. I’ve looked at LFortran before, and it does seem promising for interactive sessions. I personally love Jupyter notebooks for pedagogical reasons; maybe I’ll actually try using LFortran soon!

We need it to be super simple to install these 3rd party libraries (including in the interactive LFortran prompt!), so that when you get a newcomer, you can get them up and running in seconds (and robustly on all platforms!), just like in Python

Fully agreed there.

@pjs I did find LAPACK95 before posting here, which gave me hope, but it was extra steps on top of installing LAPACK and I didn’t want to spend half of my tutorial time on library installation.

From some of the linked past discussions, it appears pure Fortran-written code still falls quite short of equivalent BLAS/LAPACK routines written in a combination of C/Assembly. If Fortran no longer makes the fastest code (without absurd development time pairing it with Assembler), I can see why it’s dropped in popularity even in HPC environments. However, from what I understand, the do concurrent construct is intended to be an indicator to the compilers to more aggressively unroll loops, apply SIMD instructions, or otherwise run it in parallel. I think this direction is great, since it seems compilers still need assistance from the programmer. If we can give that assistance in Fortran rather than going to assembly, great!
Anecdotally, I’ve personally been writing kernels in CUDA Fortran, and having control over (part of) the on-SM shared memory (L1 cache on the GPU) is a powerful thing for writing efficient code, and is something I wish I could do for Fortran CPU code. While writing optimized Assembly for different processors can achieve the same effect, I’ve been told one can actually specify what’s loaded in L1 for different CPUs at once by using OpenCL.

So this will happen. If you are interested to help us out to get there faster, please do, we would really appreciate it.

I will consider it! I am enjoying using Fortran more than the other languages I’ve used in the past, but am still learning it and would like to consider myself an expert first before engaging in anything like this.

4 Likes

The LAPACK Helper module provides simple interfaces for a few LAPACK subroutines, including the eigenvalues and eigenvectors of a dense matrix. Subroutines are provided to print only the real parts of results when the imaginary parts are zero. Here is a calling code I wrote:

program xlapack
use lapackMod, only: eigenvalues, print_eigenvalues, print_eigenvectors
implicit none
integer, parameter :: n = 3
real               :: a(n,n),wr(n),wi(n),vr(n,n)
integer            :: i,j,iter
real, parameter    :: off_diag_mult = 0.7
do iter=1,3
   write (*,*)
   if (iter == 1) then
      write (*,*) "diagonal matrix"
      a = 0.0
      forall (i=1:n) a(i,i) = i**2
   else if (iter == 2) then
      write (*,*) "symmetric matrix"
      forall (i=1:n,j=1:n) a(i,j) = merge(1.0,off_diag_mult,i==j)*i*j
   else
      write (*,*) "dense matrix"
      call random_number(a)
      a = a - 0.5
   end if
   write (*,"(/,'matrix:')")
   do i=1,n
      write (*,"(1000f10.5)") a(i,:)
   end do
   call eigenvalues(a,wr,wi,vr)
   call print_eigenvalues("eigenvalues",wr,wi)
   call print_eigenvectors("eigenvectors",wi,vr)
end do
end program xlapack

Compiling with gfortran lapack_helper.f90 xlapack_helper.f90 liblapack.a librefblas.a and running gave

 diagonal matrix

matrix:
   1.00000   0.00000   0.00000
   0.00000   4.00000   0.00000
   0.00000   0.00000   9.00000

 Eigenmodes calculation started ...
 Eigenmodes calculation finished successfully.

 eigenvalues
    1.0000000
    4.0000000
    9.0000000


 eigenvectors
    1.00000    0.00000    0.00000
    0.00000    1.00000    0.00000
    0.00000    0.00000    1.00000

 symmetric matrix

matrix:
   1.00000   1.40000   2.10000
   1.40000   4.00000   4.20000
   2.10000   4.20000   9.00000

 Eigenmodes calculation started ...
 Eigenmodes calculation finished successfully.

 eigenvalues
   11.9658613
    0.3969251
    1.6372066


 eigenvectors
    0.22381    0.96484   -0.13780
    0.48505   -0.23290   -0.84290
    0.84536   -0.12182    0.52012

 dense matrix

matrix:
   0.46331  -0.28958  -0.38886
  -0.05445  -0.42509  -0.40991
   0.10098   0.43904   0.16454

 Eigenmodes calculation started ...
 Eigenmodes calculation finished successfully.

 eigenvalues
    0.4433555
 ( -0.12,  0.32)
 ( -0.12, -0.32)


 eigenvectors
   -0.98020 (  -0.08147,  -0.27070) (  -0.08147,   0.27070)
    0.13138 (   0.47288,  -0.45439) (   0.47288,   0.45439)
   -0.14814 (  -0.70000,   0.00000) (  -0.70000,  -0.00000)
3 Likes