Hello guys,
I worked a bit more on this topic and wanted to share my latest results. To keep this post from blowing up with code, I forked @Carltoffel's project, so you can find the latest version here.
In the project, the latest str2real function discussed here is named ‘str2real_ch’ in the module str2num_m.f90, and there are two new ones: ‘str2real’ and ‘str2real_p’. These two are basically the same function (I’ll explain why both exist, though I would like to bring them back to just one).
In the original implementation we could work on an array like this:
rval(:) = str2real ( strs(:) )
However, I have found that when reading an ASCII file it is much faster to load the whole file into memory as a single big string and then walk through it, rather than reading line by line. So I needed a way of streaming through the string, and the elemental definition was blocking me from doing that properly. The ‘str2real_p’ version (p for pointer) makes it possible, simply:
use str2num_m, only: str2real_p !< pointer version of the str2real function
character(:),allocatable,target :: strs_seq !< Original string
character(len=:), pointer :: ps !< Working pointer
real(8), allocatable :: rval(:) !< Parsed values
integer :: i, n
...
ps => strs_seq(1:)
do i = 1, n
    rval(i) = str2real_p ( ps ) !< the pointer is shifted within the function
enddo
! OR (strs must have the target attribute)
do i = 1, n
    ps => strs(i)(1:)
    rval(i) = str2real_p( ps )
enddo
(@everythingfunctional maybe this is along the lines of what you were looking for? I also found a way to simply return the last position with the same algorithm, though that also means giving up elemental.)
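For anyone who wants to try the "whole file in one string" approach, here is a minimal sketch of how the file could be loaded using stream access. The module name slurp_m and the routine read_whole_file are made up for illustration and are not part of str2num_m; the sketch also assumes the file size reported by inquire is in bytes, which is the case on the common compilers.

```fortran
module slurp_m
    implicit none
    private
    public :: read_whole_file
contains
    !> Hypothetical helper (not part of str2num_m):
    !> load the whole content of a file into a single string.
    subroutine read_whole_file(filename, buffer, iostat)
        character(len=*), intent(in) :: filename
        character(len=:), allocatable, intent(out) :: buffer
        integer, intent(out) :: iostat
        integer :: u, nbytes

        ! Stream access gives raw bytes, with no record structure
        open(newunit=u, file=filename, access='stream', form='unformatted', &
             status='old', action='read', iostat=iostat)
        if (iostat /= 0) return

        ! File size in file storage units (assumed to be bytes here)
        inquire(unit=u, size=nbytes)
        nbytes = max(nbytes, 0)   ! size is -1 when it cannot be determined

        allocate(character(len=nbytes) :: buffer)
        if (nbytes > 0) read(u, iostat=iostat) buffer
        close(u)
    end subroutine read_whole_file
end module slurp_m
```

The caller would declare the receiving string with the target attribute (as strs_seq above) so that the working pointer can be associated with it afterwards.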
And here are the results I got with the benchmarks:
--------------------------------------------------------------------------------
BLOCK 4: F str2real
Read: time consumed = 0.0330 seconds (serial)
Read: time consumed = 0.0330 seconds (array)
A = Max |rval(:)-rref(:)| = 2.220E-16, B = epsilon(1.0d0) = 2.220E-16, B-A = 0.000E+00
--------------------------------------------------------------------------------
BLOCK 5: F str2real_p
Read: time consumed = 0.0360 seconds (serial)
Read: time consumed = 0.0340 seconds (stream)
A = Max |rval(:)-rref(:)| = 2.220E-16, B = epsilon(1.0d0) = 2.220E-16, B-A = 0.000E+00
In both cases I managed to be faster than in the latest post (previous result around 0.046 s).
I also included returning NaN or HUGE(1.d0) when the input is a NaN or an infinity.
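To illustrate the idea (this is not the code from the repository), a rough sketch of such special-value handling could look like the following. The names special_values_m and special_value are hypothetical, and two things are my assumptions: that 'nan' maps to a quiet NaN and 'inf' to HUGE, and that a leading minus sign on an infinity gives -HUGE.

```fortran
module special_values_m
    use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan
    implicit none
    private
    public :: special_value
    integer, parameter :: dp = kind(1.0d0)
contains
    !> Hypothetical helper: recognise tokens such as 'nan', 'Inf', '-Infinity'.
    !> is_special is .false. (and r is 0) when the token is an ordinary number.
    subroutine special_value(token, r, is_special)
        character(len=*), intent(in) :: token
        real(dp), intent(out) :: r
        logical, intent(out) :: is_special
        character(len=3) :: head
        real(dp) :: sgn
        integer :: i, c, first

        r = 0.0_dp
        is_special = .false.
        sgn = 1.0_dp

        first = verify(token, ' ')          ! first non-blank position, 0 if none
        if (first == 0) return
        if (token(first:first) == '-') then
            sgn = -1.0_dp
            first = first + 1
        else if (token(first:first) == '+') then
            first = first + 1
        end if
        if (first + 2 > len(token)) return  ! too short to hold 'nan' or 'inf'

        ! Lower-case the first three characters (ASCII assumed)
        head = token(first:first+2)
        do i = 1, 3
            c = iachar(head(i:i))
            if (c >= iachar('A') .and. c <= iachar('Z')) head(i:i) = achar(c + 32)
        end do

        select case (head)
        case ('nan')
            r = ieee_value(r, ieee_quiet_nan)
            is_special = .true.
        case ('inf')
            r = sgn * huge(1.0_dp)          ! assumption: signed HUGE for infinities
            is_special = .true.
        end select
    end subroutine special_value
end module special_values_m
```

Returning HUGE instead of a true IEEE infinity keeps the result finite, at the cost of not being distinguishable from a genuinely huge input.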
As you will see, I no longer use the equivalence function; I just roll out the integer interpretation loops, and I included a small function to find the first non-whitespace character: ‘mvs2nwsp’.
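For readers who have not opened the repository, the general shape of these two pieces can be sketched as below. This is only an illustration written from scratch, not the actual ‘mvs2nwsp’ or the loops in str2num_m: the names scan_sketch_m, first_non_blank and read_digits are hypothetical, ASCII is assumed, and overflow is not checked.

```fortran
module scan_sketch_m
    implicit none
    private
    public :: first_non_blank, read_digits, i8
    integer, parameter :: i8 = selected_int_kind(18)
contains
    !> Position of the first character that is not a blank, tab, CR or LF
    !> (anything with ASCII code <= 32). Returns len(s)+1 if there is none.
    pure integer function first_non_blank(s) result(p)
        character(len=*), intent(in) :: s
        p = 1
        do while (p <= len(s))
            if (iachar(s(p:p)) > 32) exit
            p = p + 1
        end do
    end function first_non_blank

    !> Accumulate consecutive decimal digits starting at position p.
    !> On return p points at the first non-digit character.
    pure subroutine read_digits(s, p, val)
        character(len=*), intent(in) :: s
        integer, intent(inout) :: p
        integer(i8), intent(out) :: val
        integer :: d
        val = 0_i8
        do while (p <= len(s))
            d = iachar(s(p:p)) - iachar('0')
            if (d < 0 .or. d > 9) exit      ! not a digit: stop here
            val = val*10_i8 + d             ! no overflow check in this sketch
            p = p + 1
        end do
    end subroutine read_digits
end module scan_sketch_m
```

A real parser would call read_digits once for the integer part, once for the fractional part and once for the exponent, and then combine the three integers into the final real value.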
I’m very curious about your thoughts on this.
@tqviet What I have seen in my tests is that my times vary quite a bit between runs (around ±6%).