Really curious what the baseline speed is (ie. the best you get from the C libraries) as I am wondering whether to continue resurrecting some code from long ago, as my minimal cleanup got some speedups and interesting results but has some issues.
So I added another loop to your program that calls an old function that I dusted off just enough to fire up. The encouraging
part was I got a speed-up with all three compilers I tried (top time is original code):
(
exec 2>&1
set -x
gfortran x1.f90
./a.out
./a.out
ifort x1.f90
./a.out
./a.out
nvfortran x1.f90
./a.out
./a.out
) >> $0
exit
+ gfortran x1.f90
+ ./a.out
time * : 1.1303 seconds
time * : 0.2492 seconds
+ ./a.out
time * : 1.1178 seconds
time * : 0.2483 seconds
2318 -0.55511151231257827E-15 0.79741218980722206 0.79741218980722262 0.79741218980722262
498194 0.55511151231257827E-15 0.87089963264350001 0.87089963264349946 0.87089963264349946
810741 -0.55511151231257827E-15 0.58151616177296606 0.58151616177296661 0.58151616177296661
+ ifort x1.f90
+ ./a.out
time * : 0.7095 seconds
time * : 0.1049 seconds
870474 -.5551115123125783E-15 .6825123899679394 .6825123899679399 .6825123899679399
+ ./a.out
time * : 0.6774 seconds
time * : 0.1067 seconds
870474 -.5551115123125783E-15 .6825123899679394 .6825123899679399 .6825123899679399
+ nvfortran x1.f90
+ ./a.out
time * : 0.3050 seconds
time * : 0.1965 seconds
+ ./a.out
time * : 0.3135 seconds
time * : 0.1951 seconds
The funny numbers are because I stored the results of both calculations and did a delta between them, and listed values with >2*epsilon(0.0d0) delta.
The code was so old it was written for Holleriths and needs further work if anyone finds that appealing enough to continue, but it showed a few problems and a few reasons the defaults might be so slow.
With all three compilers when I compared the string values to the float values they were dead-on; with mine I got a reasonable value but not dead on. So some of the speed difference may be the cost of accuracy.
I very odd thing is if I replace the asterisk format with “(g0)” in ifort and nvfortran (gfortran 10 does not support that) in the original it was twice as fast. I found that interesting that it was that large a difference, although I tried it because I had a hunch it might be a little faster.
Also note that the speeds from the various compilers vary significantly, showing there is room for improvement there, which if pursued would generate fimprovements for everyone not just JSON users.
So forgive the dusty deck, and note I don’t think this method will easily lend itself to getting dead-on values. Change the 2epsilon to 1epsilon to see the majority of the values are not “perfect” but satisfy most requirements; but even if not pursued here it might be useful for some readers with a need for speed:
module M_ascii
! W-A-R-N-I-N-G
! W-A-R-N-I-N-G : Dusty deck that needs work. For illustration only
private
public ator
contains
logical function ator(astr,val)
use iso_fortran_env, only: wp => real64, ip => int64
implicit none
!
! Convert ASCII-text to DP and return .TRUE. if OK
!
character(len=*),intent(in) :: astr
integer :: nc ! NUMBER OF CHARS
real(kind=wp) dfac, exp
real(kind=wp) :: val, v, ifac
integer :: i, mode, sexp, sval, a
integer,parameter :: uech=ichar('E'), lech=ichar('e')
integer,parameter :: udee=ichar('D'), ldee=ichar('d')
integer,parameter :: plsign=ichar('+'), mnsign=ichar('-'), decim=ichar('.')
integer,parameter :: u0=ichar('0'), u9=ichar('9'), aspace=ichar(' ')
ator = .false.
val = 0.0d0
exp = 0.0d0
sval = 1
sexp = 1
ifac = 10
dfac = 1.0d0
!
nc=len_trim(astr)
mode = 0
do i = 1, nc
a=ichar(astr(i:i))
! value part of the number
if(mode.eq.0)then
select case(a)
case(u0:u9)
v = real(a-u0,kind=wp)
if (ifac.ne.1) then
val = val*ifac
else
v = v / dfac
dfac = dfac * 10.0d0
endif
val = val + v
case(plsign)
sval = 1
case(mnsign)
sval = -1
case(decim)
ifac = 1
dfac = 10.0d0
case(uech,lech,udee,ldee)
mode = 1
case(aspace)
case default
val = 0.0d0
!return
end select
else
select case(a)
! exponent
case(u0:u9)
exp = exp*10 + real(a-u0,kind=wp)
case(plsign)
sexp = 1
case(mnsign)
sexp = -1
case default
val = 0.0d0
!return
case(aspace)
end select
endif
enddo
val = val * sval
if (exp.ne.0.0d0) val = val * (10.0d0**(exp*sexp))
ator = .true.
end function ator
end module m_ascii
program main
use M_ascii, only : ator
use iso_fortran_env, only: wp => real64, ip => int64
implicit none
integer,parameter :: n = 1000000 !! number of values
real(wp) :: rval_ator(n), rval_read(n), rval(n)
integer :: ierr, i
integer(kind=ip) :: start, finish, count_rate
logical :: lerr
character(len=30),allocatable :: strings(:)
! create a list of values to parse
allocate(strings(n))
do i = 1, n
call random_number(rval(i))
write(strings(i), '(g0)') rval(i)
enddo
! use internal read
call system_clock(start, count_rate)
do i = 1, n
read(strings(i),fmt=*,iostat=ierr) rval_read(i)
enddo
call system_clock(finish)
write(*,'(a30,1x,f7.4,1x,a)') 'time * : ', (finish-start)/real(count_rate,wp), ' seconds'
! use ator()
call system_clock(start)
do i = 1, n
lerr=ator(strings(i),rval_ator(i))
enddo
call system_clock(finish)
write(*,'(a30,1x,f7.4,1x,a)') 'time * : ', (finish-start)/real(count_rate,wp), ' seconds'
! compare results
do i = 1, n
if(abs(rval_ator(i)-rval_read(i)).gt.2*epsilon(rval_ator(i))) then
write(*,'(i10,1x,*(g0,1x))')i,rval_ator(i)-rval_read(i),rval_ator(i),rval_read(i),strings(i)
endif
enddo
!do i = 1, n
! if(rval(i)-rval_read(i).ne.0.0d0) then
! write(*,'(i10,1x,*(g0,1x))')i,rval_ator(i)-rval_read(i),rval_ator(i),rval_read(i),strings(i)
! endif
!enddo
!do i = 1, n
! if(rval(i)-rval_ator(i).ne.0.0d0) then
! write(*,'(i10,1x,*(g0,1x))')i,rval_ator(i)-rval_read(i),rval_ator(i),rval_read(i),strings(i)
! endif
!enddo
end program main