LAPACK dtrtrs doubts

I’ve stumbled on a weird result from LAPACK. Since the documentation states that “Wrong results from LAPACK routines are most often caused by incorrect usage”, I am assuming that the problem lies between the keyboard and the chair (aka nel soramanego, for the Italian audience). I’m using stdlib’s LAPACK here, but the problem is there with standard LAPACK as well.

The code is the following

PROGRAM test_dtrtrs
  USE stdlib_kinds
  USE stdlib_linalg_lapack

  implicit none

  integer :: n = 25
  real(dp), allocatable, dimension(:,:) :: A, L, U
  real(dp), allocatable, dimension(:)   :: ones, x
  
  integer :: i,j 
  integer :: ierr 
  integer :: info
  
  allocate( A(n,n), L(n,n), U(n,n), ones(n), x(n), stat=ierr)
  
  ! construct lower triangular factor
  L = 0._dp
  do j = 1, n
     do i = j, n
        call random_number(L(i,j))
     end do
  end do
  
  ! construct upper triangular factor
  U = transpose(L)
  
  ! construct matrix A = L*L^T (that is thus SPD)
  A = matmul( L, U )
  
  ! test array
  ones = 1._dp

  ! generate the rhs with known solution  
  x = matmul( A, ones )

  ! forward solve
  call stdlib_dtrtrs( 'L', 'N', 'N', n, 1, L, n, x, n, info )
  ! backward solve
  call stdlib_dtrtrs( 'L', 'T', 'N', n, 1, L, n, x, n, info )
  
  ! This should be all ones 
  print *, x

END PROGRAM test_dtrtrs

This code produces the expected array of ones only for small n. For n=50 the result is never correct. Any idea of what is going on and where I am making my mistake?

The idea is that I can generate a “random” SPD matrix to test a pCG, and this kernel would be the preconditioner application (which is usually triangular for CG, to preserve SPD). Since my very easy test feeds the CG with the matrix A=LL^T and K=L for the preconditioner (applied as s=(LL^T)^{-1}w), the CG should need very few iterations, since the preconditioner is the exact inverse of the coefficient matrix. Needless to say, it doesn’t work that way if the procedure is incorrect.

Thanks

Edit: corrected the call, tested, still diverging
Edit2: compiled with gfortran -I/...stdlib -O3 -g -fbounds-check -Wall -Wno-uninitialized -framework Accelerate -L/...stdlib test_trisolve.F90 -lfortran_stdlib -o main

The parameter ‘nrhs’ (the number of right-hand sides) should be set to 1 in the call to ‘dtrtrs’.


This is true; however, it does not solve the problem. I wonder if it is just a problem of the conditioning of the matrix.


What optimization levels and other compiler options did you use? Also, which compiler? Unfortunately, I’ve found that modern compilers can be sensitive to the optimization level. Try a run with no optimization: it will be a lot slower, but if it works you at least have a clue that it’s not the code.

My compilation command is

gfortran -I/Users/Francesco/stdlib_fortran/include/fortran_stdlib/GNU-12.2.0 -O3 -g -fbounds-check -Wall -Wno-uninitialized -framework Accelerate -L/Users/Francesco/stdlib_fortran/stdlib/build/src test_trisolve.F90 -lfortran_stdlib -o main

I have also tried without any of those options, as in

gfortran -I/Users/Francesco/stdlib_fortran/include/fortran_stdlib/GNU-12.2.0  -L/Users/Francesco/stdlib_fortran/stdlib/build/src test_trisolve.F90 -lfortran_stdlib -o main

but the problem persists.

The 2-norm condition number given by

\kappa(A)=\frac{\sigma_{\rm max}(A)}{\sigma_{\rm min}(A)}

and calculated with

call dgesvd('N','N',n,n,a,n,s,u,n,u,n,work,5*n,info)
print *,s(1)/s(n)

is large for large n (\sim 10^{18} for n=100).
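
For reference, a self-contained version of that check might look like the sketch below; the wrapper name condition_number and the local variables are just illustrative, and note that dgesvd overwrites its input, hence the copy of A.

subroutine condition_number(A, n, kappa)
  ! Sketch: 2-norm condition number kappa = s(1)/s(n) via DGESVD,
  ! computing the singular values only (JOBU = JOBVT = 'N').
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer,  intent(in)  :: n
  real(dp), intent(in)  :: A(n,n)
  real(dp), intent(out) :: kappa
  real(dp) :: Acopy(n,n), s(n), work(5*n), dummy(1,1)
  integer  :: info

  Acopy = A                          ! DGESVD destroys its input
  call dgesvd('N', 'N', n, n, Acopy, n, s, dummy, 1, dummy, 1, work, 5*n, info)
  kappa = s(1)/s(n)                  ! singular values come back in descending order
end subroutine condition_number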

You may have to think of a better way of generating random matrices for which A=LL^T is not nearly singular.


I didn’t realize I could check the condition number with two lines of code. If I make the matrix better conditioned (by adding a shift to the diagonal of L in the construction of A), I get the expected result up to \kappa(A)\sim10^{16}.

In particular, for n=1000 with
L(j,j) = 1._dp + 3._dp*L(j,j) I get \kappa(A)\sim10^{16},
L(j,j) = 1.5_dp + 3._dp*L(j,j) I get \kappa(A)\sim10^{12},
L(j,j) = 2._dp + 3._dp*L(j,j) I get \kappa(A)\sim10^{10},
L(j,j) = 3._dp + 3._dp*L(j,j) I get \kappa(A)\sim10^8,
L(j,j) = 4._dp + 3._dp*L(j,j) I get \kappa(A)\sim10^7.

For the smaller values (\kappa(A)\sim10^7) I also get nice convergence of the CG in fewer than n iterations (tol=10^{-6}). And the pCG with that forward/backward substitution now converges in one iteration, obviously.

Thank you very much @jkd2022.

(But I will still accept suggestions about how to create “random” matrices to test solvers)

Those two lines of code are doing a large amount of computation: the SVD is an expensive O(N^3) operation.

One suggestion is, instead of setting the elements of the matrix to a random number r in [0,1), to set x = 2.0_wp*r - 1.0_wp, which gives random numbers in [-1,1). In 2-D space, a vector of r values lies in only 1/4 of the square that the x values can fill; in 3-D it is 1/8 of the cube, in 4-D 1/16, in 100-D 1/2**100, and in N-D 1/2**N. So in higher dimensions the columns of the matrix are vectors that all lie in a smaller and smaller corner of the full vector space, which leads to numerical linear dependence.

Your diagonal shift is another way to address the linear-dependence problem, and the two ideas can be used together (there is a combined sketch at the end of this post). If you want (or need) diagonally dominant matrices, the diagonal shift will give you that. If you don’t want diagonally dominant matrices, then don’t do the shift, or do a shift to get linear independence and then transform with a suitable orthogonal matrix to purposely destroy the diagonal dominance.

One other comment: this inner loop can be replaced with

call random_number(L(j:n,j))

to simplify the code a little.
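
Putting the [-1,1) rescaling, the diagonal shift and the array-section call together, a minimal sketch of the factor construction could look like the following (the shift of 2._dp and n = 50 are arbitrary illustrative choices):

program random_spd
  ! Sketch: SPD test matrix A = L*L^T with off-diagonal entries of L in [-1,1)
  ! and a shifted diagonal to keep the factor well away from singularity.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 50
  real(dp) :: L(n,n), A(n,n)
  integer  :: j

  L = 0._dp
  do j = 1, n
     call random_number(L(j:n,j))        ! entries in [0,1)
     L(j:n,j) = 2._dp*L(j:n,j) - 1._dp   ! rescale to [-1,1)
     L(j,j)   = L(j,j) + 2._dp           ! shift keeps the diagonal in [1,3)
  end do
  A = matmul(L, transpose(L))
end program random_spd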

I like to start with a non-singular matrix, then randomize it a bit … then a bit more, until it breaks. Large positive diagonal values with smaller off-diagonal entries are a simple example.

There are plenty of test matrices around. See, for example, LAPACK Working Note 9, “A Test Matrix Generation Suite”, and the SuiteSparse Matrix Collection.


Yes. LAPACK has routines ***CON - for example DGECON for a Double precision GEneral matrix - to estimate the (reciprocal of the) condition number of an LU-factored matrix. This is an inexpensive O(N^2) operation if the matrix is already factored, and a worthwhile sanity check. I can’t count the number of times I have generated a numerically singular matrix due to a bug or inadequate checking of the inputs.
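
A rough sketch of that pattern with the reference LAPACK names (DLANGE for the 1-norm, DGETRF for the factorization, DGECON for the estimate); the wrapper name estimate_kappa1 is mine:

subroutine estimate_kappa1(A, n, kappa1)
  ! Sketch: estimated 1-norm condition number of a general matrix A.
  ! The norm must be taken before DGETRF overwrites A with its LU factors.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer,  intent(in)  :: n
  real(dp), intent(in)  :: A(n,n)
  real(dp), intent(out) :: kappa1
  real(dp) :: Af(n,n), work(4*n), anorm, rcond
  integer  :: ipiv(n), iwork(n), info
  real(dp), external :: dlange

  Af = A
  anorm = dlange('1', n, n, Af, n, work)          ! ||A||_1 of the original matrix
  call dgetrf(n, n, Af, n, ipiv, info)            ! LU factorization, O(N^3)
  call dgecon('1', n, Af, n, anorm, rcond, work, iwork, info)   ! O(N^2) estimate
  kappa1 = 1.0_dp/rcond
end subroutine estimate_kappa1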

See chapter 15 of Higham, Nicholas. Accuracy and Stability of Numerical Algorithms. 2nd ed. Philadelphia: Society for Industrial and Applied Mathematics, 2002.


There are some methods where the condition number, or a good approximation, can be computed with little extra effort. For these cases, the condition number should always be computed, or perhaps optionally computed based on the presence of the appropriate optional argument. There are other situations where it is relatively expensive, and it should not be computed by default, but only when specifically requested by the caller (and the documentation should warn the programmer of its cost). The same arguments hold for the closely related eigenvalue and eigenvector error bounds.
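
As a sketch of that last point in Fortran terms, the “only when requested” behaviour maps naturally onto an optional argument; the interface below is purely illustrative (built on DGESV/DGECON), not any existing library’s API:

subroutine solve_with_optional_rcond(A, b, x, rcond)
  ! Illustrative only: the reciprocal condition number is estimated only
  ! when the caller supplies the optional rcond argument.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), intent(in)            :: A(:,:), b(:)
  real(dp), intent(out)           :: x(:)
  real(dp), intent(out), optional :: rcond
  real(dp), allocatable :: Af(:,:), work(:)
  integer,  allocatable :: ipiv(:), iwork(:)
  real(dp) :: anorm
  integer  :: n, info
  real(dp), external :: dlange

  n = size(b)
  allocate(Af(n,n), ipiv(n), work(4*n), iwork(n))
  Af = A
  if (present(rcond)) anorm = dlange('1', n, n, Af, n, work)  ! norm before factoring
  x = b
  call dgesv(n, 1, Af, n, ipiv, x, n, info)                   ! factor and solve
  if (present(rcond)) then
     ! reuse the LU factors: the estimate costs only O(n^2) extra
     call dgecon('1', n, Af, n, anorm, rcond, work, iwork, info)
  end if
end subroutine solve_with_optional_rcond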


Just out of curiosity, I wonder why this kind of random matrix (e.g. the LL^T case above) becomes ill-conditioned so quickly (i.e. the ratio of the largest to the smallest eigenvalue becomes extremely large as n increases). Is there some intuitive or geometrical (?) explanation…?

The easiest way to picture the condition number of a linear system is through its geometrical interpretation: you are looking for the intersection of linear entities. Consider a 2\times 2 linear system defined as

\begin{cases} ax + by = c \\ dx + ey = g \end{cases}

What you are doing here is looking for the intersection of the two lines y = \frac{c}{b} - \frac{a}{b}x and y = \frac{g}{e} - \frac{d}{e}x in the Cartesian plane. The two terms on the right-hand side, c and g, are the only things you know about the lines (they determine where each line crosses the y-axis).


Consider two cases: one with two lines of clearly different slope (left) and one with lines that are close to parallel (right). Note that the slopes of the two lines involve only information contained in the coefficient matrix of the system (i.e. a/b and d/e). Now, when you perturb the right-hand side by a small quantity \delta (in the picture, g^* = g + \delta), you translate one line by that \delta, so the intersection point moves. In the left case, the well-conditioned system, the displacement is of the order of \delta itself. In the right case, the ill-conditioned system, the lines are almost parallel, so a small \delta can move the intersection point very far from the initial one. So the condition number measures how hard it is to obtain a correct solution when you admit a small perturbation of your data (which is what working with real, error-affected data means). Indeed, you can define the condition number starting from

A(x+\delta x) = b + \delta b

that leads to

\frac{||\delta x||}{||x||} \leq \kappa(A) \frac{||\delta b||}{||b||}.
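
To make the picture concrete in numbers, here is a tiny illustration with arbitrarily chosen coefficients: the two lines differ in slope by about 0.1%, and a perturbation of 10^{-3} in the right-hand side moves the solution by order one.

program nearly_parallel
  ! Two almost-parallel lines: x + y = 2 and x + 1.001y = 2.001,
  ! whose exact intersection is (1, 1).  Perturbing the second
  ! right-hand side by 1e-3 moves the intersection to (0, 2).
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a(2,2), b(2), x(2), det

  a = reshape([1.0_dp, 1.0_dp, 1.0_dp, 1.001_dp], [2,2])
  det = a(1,1)*a(2,2) - a(1,2)*a(2,1)              ! = 1e-3, nearly singular

  b = [2.0_dp, 2.001_dp]                           ! unperturbed data
  x = [a(2,2)*b(1) - a(1,2)*b(2), a(1,1)*b(2) - a(2,1)*b(1)]/det
  print *, 'unperturbed solution:', x              ! ~ (1, 1)

  b(2) = b(2) + 1.0e-3_dp                          ! perturb the right-hand side
  x = [a(2,2)*b(1) - a(1,2)*b(2), a(1,1)*b(2) - a(2,1)*b(1)]/det
  print *, 'perturbed solution:  ', x              ! ~ (0, 2)
end program nearly_parallel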

Pedantic introduction aside, my interpretation is that when you draw numbers from a specific distribution on a bounded interval (as random_number() does), you are actually creating a set of almost parallel linear entities, and then asking where they intersect.

PS: picture taken from this set of notes by professor Mario Putti (the Great)


Look back three posts; I explained it there. It is a useful exercise to generate random matrices with elements in [0,1) and compare their condition numbers to those of matrices with elements in [-1,1). The dimension N is also important.
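
A possible sketch of that exercise, again using dgesvd just for the singular values (n = 100 is an arbitrary choice):

program compare_kappa
  ! Compare 2-norm condition numbers of random matrices with entries
  ! drawn from [0,1) versus [-1,1).
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100
  real(dp) :: B(n,n), C(n,n), s(n), work(5*n), dummy(1,1)
  integer  :: info

  call random_number(B)                 ! entries in [0,1)
  call random_number(C)
  C = 2.0_dp*C - 1.0_dp                 ! entries in [-1,1)

  call dgesvd('N', 'N', n, n, B, n, s, dummy, 1, dummy, 1, work, 5*n, info)
  print *, 'kappa, entries in [0,1): ', s(1)/s(n)
  call dgesvd('N', 'N', n, n, C, n, s, dummy, 1, dummy, 1, work, 5*n, info)
  print *, 'kappa, entries in [-1,1):', s(1)/s(n)
end program compare_kappa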


The graphical explanation in 2D is nice, but the interpretation in more than two dimensions is not so easy to grasp. In particular, why are you more likely to end up with almost parallel hyperplanes as the number of dimensions increases?