Faster string to double

hkvzjal · January 13, 2023, 8:00pm

I just realized that if we want to do a more fair and broad benchmark of the methods, we are missing one test for the fortran intrinsic read:

If I include the array version to the first bench

call system_clock(start, count_rate)
read(strs(:),fmt=*,iostat=ierr) rval(:)
call system_clock(finish)
write(*,'(A30,1X,F7.4,1X,A)') 'Read: time consumed =', &
     (finish-start)/real(count_rate,dp), ' seconds (array)'

I obtain the following results:

--------------------------------------------------------------------------------
BLOCK 1: formatted read toward string to double
         Read: time consumed =  0.6180  seconds
         Read: time consumed =  0.2570  seconds (array)
A = Max |rval(:)-rref(:)| = 5.551E-17,   B = epsilon(1.0d0) = 2.220E-16,   B-A =  1.665E-16

It is still slower than the last version of the str2real function but I think it is just fair to compare the loop and array use-cases implementations with its equivalents.

tqviet · January 14, 2023, 4:26am

Thank you, @ hkvzjal

I ran the test with two compilers:

(a) GNU Fortran (Debian 8.3.0-6) 8.3.0
(b) ifort (IFORT) 2021.8.0 20221119

in computer with CPU: Intel(R) Core™ i7-9700F CPU @ 3.00GHz

My results given above are the one with the compiler (a) because the one with (b) is a bit worse. I still do not know the reason why the one with (b) is worse than that with (a). I also do not know why your version code (with findloc) shows better performance.

hkvzjal · January 17, 2023, 10:06am

Hello guys,

So I worked a bit more on this topic and wanted to share my latest results. In order to limit this post blowing up with all the code I forked @Carltoffel 's project so that you can find the latest version here

In the project you will find the latest str2real function discussed here named as ‘str2real_ch’ in the module str2num_m.f90 . and the new ones: ‘str2real’ and ‘str2real_p’. These two are basically the same function (I’ll explain the reason for both, but would like to bring back to just one)

In the original implementation we could work on an array like this:

rval(:) = str2real ( strs(:) )

But actually, I have found that it is way faster when reading an ASCII file to load the whole file in memory in a single big string and then roll it out, rather than just jumping line by line. So, I needed a way of stream-lining through the chain. And the elemental definition was blocking me from doing it properly. in the ‘str2real_p’ (for pointer) enables to do it by simply:

use str2num_m, only: str2real_p !< pointer version of the str2real function
character(:),allocatable,target :: strs_seq !< Original string
character(len=:), pointer :: ps !< Working pointer
real(8), allocatable :: r(:)
...
ps => strs_seq(1:)
do i = 1, n
   rval(i) = str2real_p ( ps ) !< the pointer is shifted within the function
enddo
! OR
do i = 1, n
   ps => strs(i)(1:)
   rval(i) =  str2real_p( ps )
enddo

(@everythingfunctional maybe this is in the lines of what you were looking for? I also found a way to simply return the last position with the same algorithm, but forgetting about elemental also)

And here the results I got with the benchs:

--------------------------------------------------------------------------------
BLOCK 4: F str2real
         Read: time consumed =  0.0330  seconds (serial)
         Read: time consumed =  0.0330  seconds (array)
A = Max |rval(:)-rref(:)| = 2.220E-16,   B = epsilon(1.0d0) = 2.220E-16,   B-A =  0.000E+00

--------------------------------------------------------------------------------
BLOCK 5: F str2real_p
         Read: time consumed =  0.0360  seconds (serial)
         Read: time consumed =  0.0340  seconds (stream)
A = Max |rval(:)-rref(:)| = 2.220E-16,   B = epsilon(1.0d0) = 2.220E-16,   B-A =  0.000E+00

In both cases I managed to be faster than the latest post (previous result at 0.046s ±)
I also included returning NaN or HUGE(1.d0) in case of nan or infinities.

As you will see, I no longer use the equivalence function, I just roll out the integer interpretation loops, and included a small function to find the first-non-white-space: ‘mvs2nwsp’

I’m very curios about your thoughts on this

@tqviet What I have seen in my tests is that my times change quite a bit between runs (± 6%).

everythingfunctional · January 17, 2023, 2:05pm

The pointer approach is very much in line with how the fastfloat library works. Having seen you use it, it seems it works better than my initial instinct. My last question is would it be possible to add an optional stat argument to indicate failure to parse?

hkvzjal · January 17, 2023, 2:38pm

Yes, with the pointer version I tried something like this:

function str2real_p(s,stat) result(r)
    character(len=:), pointer :: s
    real(wp) :: r
    integer,intent(inout), optional :: stat
...
    if(present(stat)) stat = 0
end function

And that seems to work just fine, would have to define different error codes depending on what happened and measure possible drops in performance.

everythingfunctional · January 17, 2023, 2:59pm

The fastfloat library only used std::errc::invalid_argument, and the Fortran standard generally only states that error condition values are positive, but not what their values actually are, so any positive integer is fine. You can try and go farther and define specific values for specific conditions, but it’s not necessary.

hkvzjal · January 18, 2023, 8:14am

You can check out the latest implementation str2num_m

The core algorithm is now in an elemental subroutine:

elemental subroutine str2real_base(s,r,p,stat)
    ! -- In/out Variables
    character(*), intent(in) :: s !< input string
    real(wp), intent(inout)  :: r !< Output real value
    integer(1), intent(out)  :: p !< position within the number
    integer(1), intent(out)  :: stat !< status upon success or failure to read
...
end subroutine

And both user interfaces are kept with the functions

elemental function str2real(s) result(r)
    ...
    call str2real_base(s,r,p,stat)
end function

function str2real_p(s,stat) result(r)
    character(len=:), pointer :: s !< input string
    ...
    call str2real_base(s,r,p,err)
    p = min( p , len(s) )
    s => s(p:)
    if(present(stat)) stat = err
end function

I did nothing fancy regarding the error status for the moment. just a constant value at the beginning and 0 if success

Robpk · January 24, 2023, 4:55pm

In your library, is is intended that the string "1-3" is parsed as 1.0e-3? Similarly with a dot e.g. "2.-6" is parsed as 2.0e-6.
Apparently using the intrinsic read(s, *) x where s is the string and x a real variable, it does the same, accepts the string even though there is no e character. Is that a bug? Or intended? (mentioned in the standard perhaps).
Printing or using variables and typing 1-6 you get -5 (the result of the subtraction), so I am confused.

Also, using “too many” digits in the string you’ll get out-of-bounds access to the powers-of-10 arrays (the base array is unused by the way?). You might want to keep track of the number of digits read to avoid that.

hkvzjal · January 24, 2023, 11:07pm

I’m as baffled as you, I was not aware that using read on s=‘1-3’ would return 1.0e-3, it does indeed and testing the str2real also produced the same output… but this was not intentional. kind of mixed feelings here, I think I would prefer an error rather than a success in reading…

tested 1.-3, same thing on Intel, but with gfortran 11 I get an error Fortran runtime error: Bad real number in item 1 of list input…

You are right, since there is a counter it is possible to keep track and skip the last part of the number if too many digits are written, since in any case all that accuracy is lost when converting to the final real(wp) (wp=8). Have to add a couple of tests on that. The base array is a legacy from the first versions that I kept in the module to try and re-merge the different strategies in a library.

Thanks a lot for the heads up

RonShepard · January 25, 2023, 12:34am

I’m not sure what the consensus is on this issue, but I would point out that it is consistent with E format output where the exponent width is too small for the letter, sign, and number. In that case the letter is dropped, and the sign and exponent value are written, so no information is lost. If the exponent field is still too small, then the entire field is filled with *, losing the information. If it is possible to convert those numbers, it would be symmetrical with the previous write that produced them.

For the issue of storing intermediate values in integers, that always means that those values are limited by the integer range. If the string of digits is converted and stored as floating point numbers, then the only digits that are lost are those that would be lost anyway. Consider storing, say, a 30-digit value into eventually a 32-bit real. There is likely no integer kind large enough to store that 30-digit value. But if it is converted directly into the 32-bit real, then there would be no problem with integer overflow. Of course, each step of that conversion would require a floating point multiply and add, so it would be slower than the corresponding integer conversion.

hkvzjal · January 25, 2023, 7:51am

Good point, not sure either …

For this, the strategy I found to work quite nicely was to read the fractional part of the number with an integer(wp) that should enable the precision required for the end real. Currently only implemented for real(8), so let’s say we want to read the following value:

s = '0.123456789123456789123456789123456789'

a formatted read will throw

real(8) :: x
...
read(s,*) x 
write(*,*) x !< 0.12345678912345678 the value is truncated at the 17th decimal place

by iterating over the string (and making sure to finish the loop at the limit of the integer(8) :: int_wp) I would get first and int=0 and int_wp = 1234567891234567891 and then
r = int + int_wp * 1d-19 = 0.12345678912345678.
The idea was to use the fractional base multiplication and conversion just once at the end.

To enable for larger real types indeed we should extend the fractional base for this idea to work, but I guess it should in principle ( ? )

Robpk · January 25, 2023, 1:38pm

If I read correctly (Agner’s tables), in modern processors, FMA is faster than int -> double conversion. The former has latency of 4 or 5 (reciprocal throughput of 0.5), whereas the latter has 6 (reciprocal of 1 or 2). That’s for scalar conversion (not packed/SIMD).
Interesting.

Had to double check. Using gfortran 12.2.0 that string gets parsed to 1.0e-3. No warnings.

You can use one integer only. For "123.456e7" you might have int = 123456 and a counter for the digits after the dot. The parsed number then is 123456 * 10**(7 - k), where k = 3 in this case. No idea which way is faster/more accurate, haven’t tested myself.

RonShepard · January 26, 2023, 1:13am

This program shows what list-directed i/o does with various strings.

program efmt
   real :: x
   character(:), allocatable :: str
   str = '1.e-3'  ! decimal, with e
   read(str,*) x
   write(*,*) x
   str = '1.-3'   ! decimal, no e
   read(str,*) x
   write(*,*) x
   str = '1-3'    ! no decimal, no e
   read(str,*) x
   write(*,*) x
   str = '1234567890123456789012345678901234567890-9'    ! long string, no decimal, no e
   read(str,*) x
   write(*,*) x
end program efmt

With NAG and gfortran, the output is something like:

$ nagfor efmt.f90 && a.out
NAG Fortran Compiler Release 7.1(Hanzomon) Build 7114
[NAG Fortran Compiler normal termination]
   1.0000000E-03
   1.0000000E-03
   1.0000000E-03
   1.2345679E+30

One might expect a string-to-real function to do similar conversions.

hkvzjal · January 26, 2023, 7:59am

Thanks for the reference!!

In light of the evidences I’ll just be glad that inadvertelly got the same output

I took your advice and rearranged the core code, it is compact and cleaner, but didn’t gain any performance (except for some flukes in which running times would drop to 0.02 or 0.016, but mostly around 0.032)

Adding a check on the maximum admissible lenght of the integer(wp) for handling long strings did drop significantly the perfs (would increase to [0.044;0.047]s). Have to rethink the loops…

hkvzjal · February 1, 2023, 9:11pm

I gave some thoughts to the last discussions regarding long strings and tried a small change to the algorithm (you can check out the full version in the github):

elemental subroutine str2real_unroll(s,r,p,stat)
...
! read whole and fractional number in a single integer
pP = p
int_wp = 0
do i = p, min(19+p-1,len(s)) !< stop at the max precision available for the integer(8)
     val = iachar(s(i:i))-digit_0
     if( val >= 0 .and. val <= 9 ) then
           int_wp = int_wp*10 + val
     else if( val == period ) then
           pP = i
     else
           exit
     end if
end do
p = i

With this modification, the following test will pass:

call check("0.123456789123456789123456789123456789")

as I’ll get the same read as formated read : “0.123456789123457”

The test suggested by @RonShepard won’t pass:

call check("1234567890123456789012345678901234567890-9")

as I’ll get : “1.23456789012346” , while intel or gfortran in windows or linux will give “1.2345679E+30”

Any ideas here? (maybe the second part of the algorithm can be tweaked without loosing generality and performance?)

The good thing is that so far I managed to stay within the same performance rate.

Robpk · February 2, 2023, 10:00pm

After the loop

do i = p, min(19+p-1,len(s))

you go into an if that expects part of the exponential section, either e, E, + or -. Which in the case of

"1234567890123456789012345678901234567890-9"

does not happen. You may need to keep reading characters (digits) and discarding them if you have read enough for the set precision, until you find e, E, + or - (deal with exponent) or the end of the string (be done).

RonShepard · February 2, 2023, 10:06pm

Also must check for ‘d’ and ‘D’.

hkvzjal · October 30, 2023, 5:33pm

I made a few updates to the str2real function here: GitHub - jalvesz/Fortran-String-to-Num

All the tests pass even for long strings, and all four [‘e’,‘E’,‘d’,‘D’] are used to check for the exponents.

Was wondering, anyone interested in double checking? And I was thinking that this could be part of the stdlib, any opinions?

freevryheid · October 30, 2023, 6:42pm

I was reading this thread this morning before the last post and thinking that it would be nice to have this in the stdlib since we have to_string for the reverse. Encouraging to see development continue. Not sure about the difference in the algos posted by @Carltoffel and @hkvzjal

hkvzjal · October 30, 2023, 7:25pm

You can checkout both implementations in the file str2num_m.f90. When I picked up the project I started from @Carltoffel’s implementation then moved to the “rollout” one as I found out it was not only faster but I managed to streamline several numbers in a single string.

Topic		Replies	Views
Speed of atoi and atof vs. internal read	14	1705	May 6, 2022
Best way to read only last two real values in a line using read in Fortran Help	26	4214	July 15, 2022
Best way to declare a double precision in Fortran? Help	51	20605	April 7, 2024
A New JSON Library Announcements	47	3882	November 8, 2021
Question about double precision on ARM processors	52	1923	November 19, 2023

Faster string to double

Related topics