Problem with Intel MKL Pardiso_64

phangelder · December 23, 2024, 6:06am

I am replacing our old frontal solver with Intel MKL pardiso direct solver.
The verification problem is about the P-Delta effects of a cantilevered column.
ref : Wilson, E. L. (2004). Static and Dynamic Analysis of Structures (4th ed., pp. 120-121). Berkeley, CA: Computers and Structures, Inc.
The fully fixed cantilevered-column has the following properties:
Length: L = 10m
Cross-section: 0.1m x 0.1m square
Concrete modulus: E = 30GPa
Axial load: Fv = 4kN (compression)
Lateral load: F = 0.045kN
I use two load steps. Load step 1 yield coorect result. Load step 2 no.
The axial force becomes -5.9998378 using Pardiso linear solver compared to frontal solver of -3.99999.

Ke
0.2999970E+05
0.2999690E-37 0.3000009E+01
-0.9999007E+02 -0.9999073E-40 0.3333302E+01
-0.1800025E-45 0.5000035E+02 0.1499999E-37 0.5066242E+07
-0.5000017E+02 0.1800025E-45 -0.1499989E+05 -0.9493410E-34 0.9999971E+08
-0.1500004E-37 0.1499994E+05 0.0000000E+00 0.3164495E+06 0.3164516E-36 0.9999902E+08
-0.2999970E+05 -0.2999690E-37 0.9999007E+02 0.1800025E-45 0.5000017E+02 0.1500004E-37 0.2999970E+05
-0.2999690E-37 -0.3000009E+01 0.9999073E-40 -0.5000035E+02 -0.1800025E-45 -0.1499994E+05 0.2999690E-37 0.3000009E+01
0.9999007E+02 0.9999073E-40 -0.3333302E+01 -0.1499999E-37 0.1499989E+05 0.0000000E+00 -0.9999007E+02 -0.9999073E-40 0.3333302E+01
-0.1800025E-45 0.5000035E+02 0.1499999E-37 -0.5064576E+07 -0.5506480E-34 0.1835507E+06 0.1800025E-45 -0.5000035E+02 -0.1499999E-37 0.5066242E+07
-0.5000017E+02 0.1800025E-45 -0.1499989E+05 -0.5506480E-34 0.4999986E+08 0.1835519E-36 0.5000017E+02 -0.1800025E-45 0.1499989E+05 -0.9493410E-34 0.9999971E+08
-0.1500004E-37 0.1499994E+05 0.0000000E+00 0.1835507E+06 0.1835519E-36 0.4999942E+08 0.1500004E-37 -0.1499994E+05 0.0000000E+00 0.3164495E+06 0.3164516E-36 0.9999902E+08
Kg
-0.2000007E+00
0.3999987E-43 -0.2400003E+00
-0.1333334E-03 -0.1333343E-45 -0.2399998E+00
-0.8333293E-01 -0.7500035E+00 -0.2499961E+02 -0.1192809E+03
0.6666701E+00 0.2449270E-41 0.1999989E+03 0.8333241E+05 -0.2666664E+07
0.2777794E-03 -0.1999986E+03 0.8333293E-01 -0.9166355E+04 -0.2777777E+03 -0.2666632E+07
0.2000007E+00 -0.3999987E-43 0.1333334E-03 0.8333293E-01 -0.6666701E+00 -0.2777794E-03 -0.2000007E+00
-0.3999987E-43 0.2400003E+00 0.1333343E-45 0.7500035E+00 -0.2449270E-41 0.1999986E+03 0.3999987E-43 -0.2400003E+00
0.1333334E-03 0.1333343E-45 0.2399998E+00 0.2499961E+02 -0.1999989E+03 -0.8333293E-01 -0.1333334E-03 -0.1333343E-45 -0.2399998E+00
-0.1164148E-15 -0.6666701E+00 -0.3492405E-13 0.9428067E+02 0.4166620E+05 0.2083040E+04 0.1164148E-15 0.6666701E+00 0.3492405E-13 -0.1165031E+03
0.6666701E+00 -0.2532604E-41 0.1999989E+03 0.4166620E+05 0.6666659E+06 -0.1388888E+03 -0.6666701E+00 0.2532604E-41 -0.1999989E+03 -0.4166620E+05 -0.2666664E+07
0.3880533E-18 -0.1999989E+03 0.1164148E-15 0.2083040E+04 -0.1388888E+03 0.6666594E+06 -0.3880533E-18 0.1999989E+03 -0.1164148E-15 -0.8749696E+04 0.1388888E+03 -0.2666635E+07

The parameters iparm
iparm( 1) = 1
iparm( 2) = 2
iparm( 4) = 0
iparm( 5) = 0
iparm( 6) = 0
iparm( 8) = 5
iparm(10) = 1
iparm(11) = 1
iparm(13) = 1
iparm(14) = 0
iparm(18) = -1
iparm(19) = -1
iparm(20) = 0
iparm(21) = 0
iparm(22) = 1
iparm(27) = 1

Appreciate advice from some experts

ivanpribec · December 23, 2024, 9:20am

This will be hard to advise on without a minimal working example.

I annotated the non-default iparm settings you are using,

iparm( 1) = 1 ! Use default values. (0 - use default values, 1 - user must supply all values)
iparm( 2) = 2 ! Fill-in reducing ordering for the input matrix.
iparm( 4) = 0
iparm( 5) = 0
iparm( 6) = 0
iparm( 8) = 5 ! Iterative refinement step.
iparm(10) = 1 ! Pivoting perturbation.
iparm(11) = 1 ! Scaling vectors.
iparm(13) = 1 ! Improved accuracy using (non-) symmetric weighted matching.
iparm(14) = 0
iparm(18) = -1 ! Report the number of non-zero elements in the factors.
iparm(19) = -1 ! Report number of floating point operations (in 10^6 floating point operations) that are necessary to factor the matrix A.
iparm(20) = 0
iparm(21) = 0
iparm(22) = 1 ! Inertia: number of positive eigenvalues.
iparm(27) = 1 ! Matrix checker.

If you are using iparm(1) = 1, then you should set all values iparm(2) to iparm(64). So perhaps before you change the individual settings, you should do:

! Default everything
iparm(1:64) = 0

! Change individual settings
iparm(1) = 1
iparm(2) = 2
...

Otherwise, I would carefully review the solver manual, make sure the matrix storage is correct (iparm(27) = 1 checks the storage is consistent), and that the input and output vectors x and b are being used correctly (i.e. is the right-hand side b overwritten, or is the solution returned in a separate vector).

Since you want to solve for multiple load vectors (right-hand sides), you need to make sure the phases of the solver are configured correctly. Probably you’d start with phase = 12 (analysis, numerical factorization), followed by repeated phase = 33 (solve, iterative refinement) for each load step.

phangelder · December 23, 2024, 9:54am

Create the A (tangential) Matrix by adding Ke + Kg together. The RHS load vector(B Matrix) is {0,0,0,-4000,0,45,0,0,0}. Solving Ax = B where x is the displacement vector. Assuming the first node is fixed. x={0,0,0,0,0,0, u2,v2,w2, Rot-x, Rot-y, Rot-z}. Solve for the 6 unknowns. The expected values are {-0.301 -0.000 -85.928 0.62961E-04 0.13004E-01 -0.18928E-06}.
Using pardiso_sym(NEQN, NNZ, NRHS, aMatx, ia, ja, bMatx, asdis) and the above iparm setting, the result obtained are {-0.548 -0.000 -124.284 0.155459E-03 0.188982E-01 -0.519404E-06}. Only One RHS

ivanpribec · December 23, 2024, 10:47am

Maybe I’m not used to the terminology, but I don’t understand what are your DoFs. The matrices shown are n = 12, but your load vector is length 9, and your solution vector is length 6. I assume this is because you are eliminating the Dirichlet nodes like described here: finite element - FEM: which is the correct way to impose Dirichlet B.C - Computational Science Stack Exchange

I tried to write a MWE, but I can’t get it to match your description.

program mwe
use, intrinsic :: iso_fortran_env, only: stdout => output_unit
implicit none

integer, parameter :: dp = kind(1.0d0)

real(dp), allocatable :: Ke(:,:), Kg(:,:), x(:), b(:), b_expected(:)
integer :: n, k

n = 12
allocate(Ke(n,n),Kg(n,n),x(n),b(n))
Ke = 0
Kg = 0

call readltm("Ke.dat",n,Ke)
call readltm("Kg.dat",n,Kg)

call printltm(stdout,n,Ke,"Ke = ")
call printltm(stdout,n,Kg,"Kg = ")

! The vector of unknowns x contains
!      {0,0,0,0,0,0, u2,v2,w2, Rot-x, Rot-y, Rot-z}
x(1:6) = 0

! from frontal solver (?)
x(7:12) = [-0.301, -0.000, -85.928, 0.62961e-4, 0.13004e-1, -0.18928e-6]
! from pardiso (?)
!x(7:12) = [0.548, -0.000, -124.284, 0.155459e-03, 0.188982e-01, -0.519404e-06]

! The load vector
allocate(b_expected(9))
b_expected = [0.0,0.0,0.0,-4000.0,0.0,45.0,0.0,0.0,0.0]

b = matmul(Ke + Kg,x)
print '(/,2A12)', "b(4:12)", "b_expected"

do k = 1, 9
    print '(2ES14.5)', b(3 + k), b_expected(k)
end do

contains

    subroutine printltm(unit,n,A,str)
        integer, intent(in) :: unit, n
        real(dp), intent(out) :: A(n,n)
        character(*), intent(in), optional :: str
        integer :: i, j
        if (present(str)) write(unit,'(A)') str
        do i = 1, n
            write(unit,'(*(ES14.4))') (A(i,j),j=1,i)
        end do
    end subroutine

    subroutine readltm(filename,n,A)
        character(*), intent(in) :: filename
        integer, intent(in) :: n
        real(dp), intent(out) :: A(n,n)
        integer :: unit, i, j
        open(newunit=unit,file=filename,status='old')
        do i = 1, n
            read(unit,*) (A(i,j),j=1,i)
        end do
        close(unit)
    end subroutine

end program

Edit: I forgot the matrix Ke + Kg is symmetric, so the multiplication with matmul is only half the matrix.

phangelder · December 23, 2024, 10:53am

sorry load vector is {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -4000.0, 0.0, 45.0, 0.0, 0.0, 0.0} solution length is
{0.0, 0.0, 0.0, 0.0, 0.0, 0.0, u2,v2,w2, Rot-x, Rot-y, Rot-z} both 12 x 1 vector

ivanpribec · December 23, 2024, 11:27am

Your matrices Ke and Kg are lower-triangular. Is the global matrix symmetric and you are only storing the lower triangle?

phangelder · December 23, 2024, 11:49am

yes. if you need full matrix of ke, kg and kt=(ke+kg). i can post them here.

JohnCampbell · December 23, 2024, 12:02pm

If you have an applied (vertical) load of 4 kN and the original solver gives an axial force of 4 kN, I can’t see how the new solver gives an axial force of 6 kN, unless there has been a significant change in the interpretation of P-delta effects.
Equilibrium would give the vertical axial force should remain as 4 kN, unless there has been a significant change to the axial force direction, which was not present in the old solution.

Alternatively, as 6 kN = 2 kN + 4 kN; is your second result case the sum of case 1: half_load + case 2:full_load ?

ivanpribec · December 23, 2024, 1:57pm

If I try to solve using Cholesky factorization (DPOSV, assuming A = Ke + Kg is symmetric positive definite) it fails with info = 7. If I create the full (symmetric) matrix and try to solve with LU factorization (DGESV) it fails with info = 10.

If I use DSYSV I get info = 0,

    call dsysv('L',n,1,A,n,ipiv,b,n,wrk,lwrk,info)
    if (info /= 0) then
        print *, "DSYSV failed with info = ", info
        stop
    end if

but the solution doesn’t match what is expected.

Here is the whole program if you want to try it out:

Click here

! main.F90 --
!    Test driver for a discrete cantilever problem
!
program main
use, intrinsic :: iso_fortran_env, only: stdout => output_unit
implicit none

integer, parameter :: dp = kind(1.0d0)

real(dp), allocatable :: Ke(:,:), Kg(:,:), x(:), b(:), b_expected(:)
integer :: n, k

n = 12
allocate(Ke(n,n),Kg(n,n),x(n),b(n))
Ke = 0
Kg = 0

call readltm("Ke.dat",n,Ke)
call readltm("Kg.dat",n,Kg)

call printltm(stdout,n,Ke,"Ke = ")
!print *, Ke

call printltm(stdout,n,Kg,"Kg = ")
!print *, Kg

! The vector of unknowns x contains
! {0,0,0,0,0,0, u2,v2,w2, Rot-x, Rot-y, Rot-z}

x(1:6) = 0
! from frontal solver (?)
x(7:12) = [-0.301, -0.000, -85.928, 0.62961e-4, 0.13004e-1, -0.18928e-6]

! from pardiso (?)
!x(7:12) = [0.548d0, -0.000d0, -124.284d0, 0.155459d-03, 0.188982d-01, -0.519404d-06]

! The load vector
allocate(b_expected(n))
b_expected(1:6) = 0
b_expected(7:12) = [-4000,0,45,0,0,0]

b = b_expected
solve: block
    real(dp) :: A(n,n), d(n), wrk(n)
    integer :: info, ipiv(n), i, j, lwrk
    external :: dgesv, dposv, dsysv
    A = Ke + Kg
#ifdef FULL
    ! Extract the diagonal
    do i = 1, n
        d(i) = A(i,i)
    end do
    ! A := A + transpose(A) - diag(A)
    A = A + transpose(A)
    do i = 1, n
        A(i,i) = A(i,i) - d(i)
    end do
    ! Inspect the matrix is symmetric
    call printgem(stdout,n,A,"A = ")
    ! Solve using LU factorization
    call dgesv(n,1,A,n,ipiv,b,n,info)
    if (info /= 0) then
        print *, "DGESV failed with info = ", info
        stop
    end if
    print *, "residual norm = ", norm2(matmul(A,b) - b_expected)
#else
!    call dtrtrs('L','N','N',n,1,A,n,b,n,info)
    call dposv('L',n,1,A,n,b,n,info)
!    lwrk = n
!    call dsysv('L',n,1,A,n,ipiv,b,n,wrk,lwrk,info)
    if (info /= 0) then
        print *, "DPOSV failed with info = ", info
        stop
    end if
#endif
end block solve

write(*,'(/,2A14)') "x (lapack)", "x (phan)"
do k = 1, 12
    print '(2ES14.5)', b(k), x(k)
end do

contains

    subroutine printltm(unit,n,A,str)
        integer, intent(in) :: unit, n
        real(dp), intent(out) :: A(n,n)
        character(*), intent(in), optional :: str
        integer :: i, j
        if (present(str)) write(unit,'(A)') str
        do i = 1, n
            write(unit,'(*(ES14.4))') (A(i,j),j=1,i)
        end do
    end subroutine

    subroutine printgem(unit,n,A,str)
        integer, intent(in) :: unit, n
        real(dp), intent(out) :: A(n,n)
        character(*), intent(in), optional :: str
        integer :: i, j
        if (present(str)) write(unit,'(A)') str
        do i = 1, n
            write(unit,'(*(ES14.4))') (A(i,j),j=1,n)
        end do
    end subroutine

    subroutine readltm(filename,n,A)
        character(*), intent(in) :: filename
        integer, intent(in) :: n
        real(dp), intent(out) :: A(n,n)
        integer :: unit, i, j
        open(newunit=unit,file=filename,status='old')
        do i = 1, n
            read(unit,*) (A(i,j),j=1,i)
        end do
        close(unit)
    end subroutine

end program

phangelder · December 24, 2024, 12:42am

Suggest you add a large number to the diagonal term of the first six equations (prescribed displacements and rotations all zero) and solve for solution.

JohnCampbell · December 24, 2024, 2:36am

Perhaps you should describe how you have applied boundary conditions to your equations ?

phangelder · December 24, 2024, 3:24am

As per my previous reply. Add a large value to the diagonal of the equations as applied boundary conditions.
Solve Ax=B. Use the result x to find the reactions and internal member forces. After first load step, the result x is always higher than expected.

JohnCampbell · December 24, 2024, 1:36pm

When solving K.x=f, you can first eliminate the equations of known x value.
This reduction produces an Ax=B that is better behaved that Pardiso_64 can solve.
To recalculate the forces, you can calculate f = sum (Ke.x) for all x values.

ivanpribec · December 24, 2024, 2:55pm

This chapter from another of E. Wilson’s books appears related to the discussion: Boundary Conditions and General Constraints (full book available here: Three Dimensional Static and Dynamic Analysis of Structures). This is a different book than the one mentioned in the original post: Static and Dynamic Analysis of Structures | Computers and Structures, Inc..

I found a description of the cantilever problem here: https://web.wiki.csiamerica.com/wiki/spaces/tp/pages/1803067/P-Delta+effect+for+a+cantilevered+column

phangelder · December 25, 2024, 2:36am

any suggestion on how to eliminate the known x values from the sparse matrix A and B knowing IA and JA?

phangelder · December 25, 2024, 2:38am

I guess you may already aware that the problem is how to set up pardiso to solve the equation Ax=B where A is the tangential stiffness matrix of which Kg becomes significant.

ivanpribec · December 25, 2024, 8:00am

I found Wolfgang Bangerth’s slides helpful the last time I did this: https://www.math.colostate.edu/~bangerth/videos/676/slides.21.6.pdf (pdf, 340 kB)

phangelder · December 25, 2024, 9:02am

let me study first. The issue is when the negative diagonal term of Kg is almost of the same magnitude of Ke. How to get Pardiso to produce the correct result?

JohnCampbell · December 28, 2024, 9:39am

How valid are the P-delta assumptions when the negative diagonal term of Kg is almost of the same magnitude of Ke ?
I didn’t think the assumptions extended to instability.

phangelder · December 29, 2024, 2:59am

There are several techniques (critical load, arc length etc…) to solve load instability problem. I would like to explore solving this type of problem with Pardiso. Instability is a priori unknown for my case.

Topic		Replies	Views
Code for Intel ODE Solver - dodesol	7	821	August 15, 2023
Stiffness matrix inversion: discrepancy to Matlab	4	766	November 7, 2020
Comparative study of Fortran compilers, MKL, and FFTW	0	212	September 5, 2024
Molecular Dynamics Equilibration Step Homework	12	417	October 27, 2024
Performance, C vs. Fortran	25	4246	July 7, 2021

Problem with Intel MKL Pardiso_64

Related topics