Concerning the evidence for what I claim to happen in my actual use case, I cannot really dump here the whole repository, but consider this (the whole body of grad_delta_replica
, you just have to trust me that so2nn_inlined
is placed in the same module and is identical to what I have shown you above).
function grad_delta_replica(a) result(dDelta)
real(8),dimension(:) :: a
complex(8),dimension(Nspin,Nspin,Norb,Norb,Ldelta,size(a)) :: dDelta,dDelta_ref
integer :: ispin,iorb,jorb,ibath
integer :: i,j,k,iw,counter,Nso
complex(8),dimension(Nspin*Norb,Nspin*Norb) :: Htmp,Hbasis_so,Href
real(8),dimension(Nspin*Norb) :: eigenH
complex(8),dimension(Nspin*Norb,Nspin*Norb,Ldelta) :: invHso,Haux
complex(8),dimension(Nspin,Nspin,Norb,Norb,Ldelta) :: invHnn,invHref
real(8),dimension(Nbath) :: dummy_Vbath
type(nsymm_vector),dimension(Nbath) :: dummy_lambda
!
Nso = Nspin*Norb
!
!Get Hs
counter = 0
do ibath=1,Nbath
allocate(dummy_lambda(ibath)%element(Nlambdas))
!
counter = counter + 1
dummy_vbath(ibath) = a(counter)
dummy_lambda(ibath)%element=a(counter+1:counter+Nlambdas)
counter=counter+Nlambdas
enddo
!
dDelta=zero
dDelta_ref=zero
counter=0
do ibath=1,Nbath
Htmp= nn2so_reshape( Hreplica_build(dummy_lambda(ibath)%element) ,Nspin,Norb)
!
!! OLD IMPLEMENTATION: loop over many inversions
!!
do iw=1,Ldelta
Haux(:,:,iw) = zeye(Nspin*Norb)*xi*Xdelta(iw) - Htmp
call inv(Haux(:,:,iw))
!invHnn(:,:,:,:,iw) = so2nn_reshape(Haux(:,:,iw),Nspin,Norb)
invHref(:,:,:,:,iw) = so2nn_reshape(Haux(:,:,iw),Nspin,Norb)
enddo
!!
!! NEW IMPLEMENTATION: prediagonalization and optimized matrix products
!
call eigh(Htmp,eigenH,jobz='V',uplo='U')
!
do concurrent(i=1:Nso, j=1:Nso, iw=1:Ldelta)
invHso(i,j,iw) = sum(Htmp(i,:)/(xi*Xdelta(iw)-eigenH(:))*conjg(Htmp(j,:)))
end do
!
!Nso 2 NN reshape (local scope, all iω at once)
invHnn = so2nn_inlined(invHso,Nspin,Norb,Ldelta)
!
if(any(abs(invHnn-invHref)>1d-18))then
call splot('invHnn.dat',Xdelta,invHnn)
call splot('invHref.dat',Xdelta,invHref)
error stop "invHnn =/= invHref"
endif
!
!Derivate_Vp
counter = counter + 1
dDelta(:,:,:,:,:,counter)=2d0*dummy_Vbath(ibath)*invHnn(:,:,:,:,:)
dDelta_ref(:,:,:,:,:,counter)=2d0*dummy_Vbath(ibath)*invHref(:,:,:,:,:)
if(any(abs(dDelta(:,:,:,:,:,counter)-dDelta_ref(:,:,:,:,:,counter))>1d-18))then
call splot('dDelta.dat',Xdelta,dDelta(:,:,:,:,:,counter))
call splot('dDelta_ref.dat',Xdelta,dDelta_ref(:,:,:,:,:,counter))
error stop "dDelta =/= dDelta_ref"
endif
print *, 'Derivative of Vp passed'
!
!Derivate_lambda_p
do k=1,Nlambdas
counter = counter + 1
Hbasis_so=nn2so_reshape(Hreplica_basis(k)%O,Nspin,Norb)
do concurrent(iw=1:Ldelta)
Htmp = matmul(matmul(invHso(:,:,iw) , Hbasis_so) , invHso(:,:,iw))
Href = matmul(matmul(Haux(:,:,iw) , Hbasis_so) , Haux(:,:,iw))
! THESE WOULD BE THE CORRECT CALLS TO ANOTHER RESHAPE FUNCTION
!dDelta(:,:,:,:,iw,counter)=so2nn_reshape(dummy_Vbath(ibath)**2*Htmp,Nspin,Norb)
!dDelta_ref(:,:,:,:,iw,counter)=so2nn_reshape(dummy_Vbath(ibath)**2*Href,Nspin,Norb)
enddo
if(any(abs(Htmp-Href)>1d-18))then
error stop "Htmp =/= Href"
endif
print *, 'Double matmul passed'
Htmp = dummy_Vbath(ibath)**2 * Htmp
Href = dummy_Vbath(ibath)**2 * Href
! THESE ARE THE TWO INCRIMINED LINES
dDelta(:,:,:,:,:,counter)=so2nn_inlined(Htmp,Nspin,Norb,Ldelta)
dDelta_ref(:,:,:,:,:,counter)=so2nn_inlined(Href,Nspin,Norb,Ldelta)
! if(any(abs(dDelta(:,:,:,:,:,counter)-dDelta_ref(:,:,:,:,:,counter))>1d-18))then
! call splot('dDelta.dat',Xdelta,dDelta(:,:,:,:,:,counter))
! call splot('dDelta_ref.dat',Xdelta,dDelta_ref(:,:,:,:,:,counter))
! error stop "dDelta =/= dDelta_ref" <-- IF UNCOMMENTED IT WOULD FAIL HERE
! endif
print *, 'Reshape passed'
! NEVERTHELESS IF I DON'T CHECK I GET TO THIS PRINT AND EXIT THE FUNCTION
enddo
!
enddo
end function grad_delta_replica
So as annoted in uppercase comments, with the above code I manage to get accross the whole function and exit. Then the result is nasty (I guess a lot of zeros here and there) and I catch a floating-point exception downstream (in the minimizer inner callgraph):
$ fpm @debug --target testFasterFit
minimize_krauth.f done.
optimize_cgfit_routines.f90 done.
minimize_sascha.f done.
VARS_GLOBAL.f90 done.
mwe.f90 done.
OPTIMIZATION.f90 done.
BATH_AUX.f90 done.
BATH_FIT.f90 done.
libCGfit.a done.
testFasterFit.f90 done.
testCGfit.f90 done.
mwe done.
testFasterFit done.
testCGfit done.
[100%] Project compiled successfully.
Reading bath from hamiltonian.restart
there are 9 lines in +hamiltonian.restart
#V_i Lambda_i1
1
-9.917759089942E-02 -2.000245670544E+00
2.390834374608E-01 -4.367278655013E-01
1.308364827565E-01 -5.601539875953E-02
4.908032834208E-02 -4.519910981637E-03
5.479599716106E-02 6.086421383198E-03
1.396752577662E-01 6.521317151957E-02
2.406750847783E-01 4.984563931245E-01
( 1.00, 0.00) ( 0.00, 0.00) ( 0.00, 0.00)
( 0.00, 0.00) ( 1.00, 0.00) ( 0.00, 0.00)
( 0.00, 0.00) ( 0.00, 0.00) ( 1.00, 0.00)
At line 637 of file test/testFasterFit.f90
Fortran runtime warning: An array temporary was created for argument 'y1' of procedure 'sreada1_rc'
read: weiss.in
read: weiss.in
read: weiss.in
REFERENCE FIT:
\Chi2 fit with CG-nr and CG-weight: 1 on: weiss
Fitted functions 9
CG: ftol updated to: 1.00E-05
CG: itmax updated to: 500
CG: istop update to: 0
Iter,F_n = 1 4.8236829832142269E-005
gradF: 1.1014166103377348E-005
-2.5504306323596377E-007
-7.9976204141721073E-005
-3.9973400030288891E-006
4.3119745248945053E-005
1.1395555172914915E-004
1.0004361834358329E-004
1.0435552752348357E-004
-1.0522610915076655E-004
1.5794686669033230E-005
-2.5433889783246183E-005
1.3068602217302934E-004
9.5620840808528795E-005
-1.5100085122048181E-005
A,B,ftol = 6.8187445330087951E-007 1.4372683502857203E-008 1.0000000000000001E-005
Converged with (|F_n-F_n-1|/1+|F_n|), ||a_n - a_n-1||^2/1+||a_n||^2 < ftol_: 6.8187445330087951E-007 1.4372683502857203E-008
chi^2|iter = 4.755492249E-05 | 1 <-- All Orbs, All Spins
#V_i Lambda_i1
1
-9.916657673332E-02 -2.000245925587E+00
2.390034612567E-01 -4.367318628413E-01
1.308796025017E-01 -5.590144320780E-02
4.918037196042E-02 -4.415555454114E-03
5.469077105191E-02 6.102216069867E-03
1.396498238764E-01 6.534385754174E-02
2.407707056191E-01 4.984412930394E-01
( 1.00, 0.00) ( 0.00, 0.00) ( 0.00, 0.00)
( 0.00, 0.00) ( 1.00, 0.00) ( 0.00, 0.00)
( 0.00, 0.00) ( 0.00, 0.00) ( 1.00, 0.00)
NEW FASTER FIT:
\Chi2 fit with CG-nr and CG-weight: 1 on: weiss
Fitted functions 9
CG: ftol updated to: 1.00E-05
CG: itmax updated to: 500
CG: istop update to: 0
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Derivative of Vp passed
Double matmul passed
Reshape passed
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
#0 0x7fece5e873af in ???
#1 0x4505e1 in __cgfit_routines_MOD_df1dim
at ./src/optimize_cgfit_routines.f90:96
#2 0x44e269 in __cgfit_routines_MOD_dbrent_
at ./src/optimize_cgfit_routines.f90:331
#3 0x451520 in __cgfit_routines_MOD_dlinmin
at ./src/optimize_cgfit_routines.f90:69
#4 0x435f71 in __optimize_minimize_MOD_fmin_cg_df
at ./src/OPTIMIZATION.f90:97
#5 0x47c720 in chi2_fitgf_replica
at ./src/BATH_FIT.f90:176
#6 0x480d66 in __bath_fit_MOD_chi2_fitgf_generic_normal
at ./src/BATH_FIT.f90:78
#7 0x42b6a9 in testdeltabuild
at test/testFasterFit.f90:646
#8 0x42bc89 in main
at test/testFasterFit.f90:573
<ERROR> Execution failed for object " testFasterFit "
<ERROR>*cmd_run*:stopping due to failed executions
STOP 1
As you can see from the stdout prints, I get across the “Derivative of Vp passed; Double matmul passed; Reshape passed” triad many times, until exiting the function, and catch a problem only later. Without the -ffpe-trap=invalid,overflow,zero
flag it would even go further until finishing the program, with very subtle numerical errors on the fitted coefficients.
So yes, I don’t really know why the MWE does a segmentation fault inside so2nn_inlined
and the whole program does not, but I guess at this point is kind of irrelevant.