Solving sparse underdetermined linear systems

Fortranners,

I’m wondering if anyone knows of a good Fortran library for solving a sparse underdetermined system of equations in the minimum-norm sense?

That is, solve Ax = b, where A is m x n, with m<n (so, more columns than rows). Return the solution that minimizes the norm of x.

I can solve these equations with the LAPACK routine dgels. That works very well, but is assuming a dense system. My system is large (say n around 30,000) and sparse (only a small precent of the elements are non-zero, mostly along the diagonal), and this routine is quite slow for those cases.

I’ve been experimenting with lsqr (sometimes works, sometimes doesn’t converge) and lusol (can’t seem to get to work, claims systems are singular that dgels solves just fine). I’m not a linear algebra guru though.

What else is out there?

1 Like

Maybe one of these could be of use?

2 Likes

Not a direct answer, but I just saw a talk about something somewhat related:

Might be some inspiration in there.

2 Likes

Oooo… qr_mumps looks interesting! I’ll try it out. Unfortunately looks like it has a complicated build process involving several other libs that I also have to figure out how to build and install. (like the pre-FPM bad old days).

the MKL one says " NOTE: The underdetermined systems of equations are not supported"

I’ve never solved such problems with sparse matrices, however it does look like SciPy provides both LSQR and LSMR.

The LSMR paper shows it’s based on the GMRES method, so I would bet it’s more robust than conjugate gradient for general real-world cases. It’s page is here, there are many implementations including Fortran, may well be worth a try.

3 Likes

There is the R package sparsenet: Fit Sparse Linear Regression Models via Nonconvex Optimization by Mazumder, Friedman and Hastie, with underlying Fortran (originally mortran) code sparsenet.f, with the algorithm explained in a paper.

Follow up: it does appear the qr_mumps is exactly what I want. But I can’t get it to work, at least on my Mac laptop. Opened a ticket here: Apple Silicon Segmentation fault (#2) · Issues · qr_mumps / qr_mumps · GitLab

I’m curious to know if anyone here has used this library?

@jacobwilliams, just a wild guess based on looking over the qr_mumps and StarPU documentation but you might try installing hwloc as suggested in the StarPU docs if you don’t already have it installed. Your error messages suggest its something to do with how StarPU schedules work on processors.

See

https://www.open-mpi.org/projects/hwloc/

Also

StarPU runtime is here

You might also look at the SVD package PROPACK. You would have to write your own code to solve the LS problem using SVD generated by PROPACK but thats just a few matrix multiplies. I recall reading somewhere that some folks suggested using SVD for underdetermined systems (I think thats what LAPACK does). The problem with SVD though is its high computational cost. It looks like PROPACK uses openMP so it might not be so bad.

PROPACK link:

http://sun.stanford.edu/~rmunk/PROPACK/

1 Like

I maintain the Arch Linux package and have it installed. I can test your code on x86_64 if it is easy to install

1 Like

Oh cool!

Yeah, if you are bored one day and want to try this example, that would be great!:

(note it’s the qrm branch). All this is doing is reading a sparse matrix (about 24000x28000) from a data file and computing a minimum norm solution. first with Lapack (dgels) and then with qr_mumps. The qr_mumps one is totally untested, but I based it on one of the examples, so hopefully it will work. the Lapack one takes a few minutes on my laptop.

You can compile with FPM: fpm run --profile release --flag "-lblas -llapack" but would need to add whatever extra flags you need to get qr_mumps to link.

Would really be interested in seeing the comparison.

Yeah I have StarPU compiled and hwloc installed. I’m not sure if any of the developers of any of these libraries have ever tried it on an Apple Silicon Mac, so it’s hard for me to say what the problem could be.

Thanks, I’ll look at PROPACK. But glancing at one of the pdfs, its seems to say that it requires m>=n, so that doesn’t work for my case.

Maybe you can find something of use in the book - Sparse Solutions of Underdetermined Linear Systems and Their Applications, 2021, SIAM. It’s supposed to contain 64 algorithms (in MATLAB). I only had a glance at the front matter (which is openly available).

1 Like

Interesting, I’ve never seen this book before.

They seen to be solving something slightly different though based on this from the intro:

The aim of the study is to find a sparse solution in the sense that the number of nonzero entries of the solution is the smallest.

Which isn’t really the same thing as a minimum norm solution of a sparse system. But I’ll look at it. Thanks!

Jacob, if you dig deep into the hwloc documention they mention it supporting Apple M1. There is an xml file that has what I guess is a “profile” for an M1 along with several others. Again, the error message you posted makes me think you problem is with StarPU and not qr_mumps. I didn’t look at the qr_mumps build files but you might have to build StarPU and hwloc outside of qr_mumps first. Also, I was looking at the source code and didn’t see any comment about restriction to overdetermined or square matrices. Still might be worth a try just to see if it works. Also, PROPACK looks like a good candidate for modernizing and adding FPM support. Might give that a go and see where it leads.

1 Like

Oh right, the preface repeats this,

linear system of equations consists of a known matrix A and a known vector b such that Ax=b for an unknown vector x, where A is of size m×n. When m=n and m > n, methods of solving for x are well known and are discussed in standard numerical analysis textbooks. However, when m < n, Ax=b is called an underdetermined linear system. Such a linear system has become a subject of research in the last 15 years as part of an extremely active study in the community of compressive sensing. Mathematically, an underdetermined linear system has many solutions. The aim of the study is to find a sparse solution in the sense that the number of nonzero entries of the solution is the smallest. [emphasis added] Although there exists a sparse solution, the difficulty is that it may take years to find such an exact solution using our current computer power.Therefore, we must look for an approximation of the sparse solution which can be found within reasonable time and tolerance.

An under-determined system has an infinite number of solutions. Judging by the description of compressed sensing on Wikipedia, a solution with a small number of non-zeros is different type of constraint from the minimization of the L_2-norm (the least squares solution), and is more suitable for some types of problems.

1 Like

Could linear programming strategies be adapted for optimised solutions to this problem ?

took a little bit longer because I mixed up MUMPS and QR_MUMPS. But installing QR_MUMPS is very straight forward:

cmake -B build -DARITH=d -DCMAKE_INSTALL_PREFIX=${PWD} -DBUILD_SHARED_LIBS=ON -DQRM_ORDERING_METIS=ON
cmake --build build --parallel --target install  

I have installed METIS as a package. After exporting the LD_LIBRARY_PATH,

fpm run --profile release --flag "-lblas -llapack -L/tmp/qr_mumps/lib -ldqrm -lqrm_common -I/tmp/qr_mumps/include"

I get

Project is up to date
 reading data/matrix_test.json
 check_solution_in_file
 b error:    6.3514977426607910E-013
 solve_with_qrm
 error:    4.6937255843139300E-012
 time:   0.25215055500000000      sec
 solve_with_lapack
 allocate dense matrix
 call dgels
 info =            0
 error:    4.1530253854489664E-012
 time:    634.22027651799999      sec
2 Likes

0.2 sec! Wow, thanks! I guess I’m going to try Linux. How many cores does the system you ran this on have?

Also did you not have to build/install StarPU?

I did not install StarPU and use a Intel® Core™ i7-8665U (4 cores with hyperthreading). I could not detect any traces of parallel execution, /usr/bin/time reports 99% use for the whole fpm comman.d

1 Like

Oh interesting! So you think this was just the serial execution? So the speed up wrt lapack is just from using the sparsity, rather than doing anything in parallel?