Best way to read only last two real values in a line using read in Fortran

FYI: To keep the example simpler I removed a DO loop that stepped through and replaced all the obviously non-numeric characters with an X which eliminates the issues for ",',*,/.

Here is an attempt, by identifying the tokens without “read(line,*) token”. It uses two simple DO loop scans of the line string.
It first replaces all other delimiters with a blank then searches for the start and end of all tokens. The list of possible delimiters can always change !
It then identifies the start and end of all tokens, testing if the Fortran READ accepts the token as a real number.

This approach does not include some useful free format extensions, such as:

  • the use of , , to identify null tokens
  • the use of arithmentc operators + - / * ^ for sumple numeric calculations.
  • coping with ascii HT char(9)
      character :: line*128
      character :: blank        = ' '
      character :: delimiters*5 = ' ,()='
      character :: list_of_tokens(64)*32
      real      :: list_of_reals(64), val
      integer   :: nt, nr, fc, k, iostat

      line = ' ( v=  1) w = 2.843 unit1 = 5.867 unit2  4.567 1.232 )'
      write ( *,10) 'Initial line :', trim(line)

!   replace other delimiters with blank
      do k = 1, len_trim(line)
        if ( index (delimiters, line(k:k)) > 0 ) line(k:k) = blank
      end do
      write ( *,10) 'Without delim:', trim(line)

!   find all tokens in filtered line
      nt = 0    ! number of tokens
      nr = 0    ! number of reals
      fc = 0    ! first character of next token

      write ( *,10) 'Search for Tokens'
      do k = 1, len_trim(line)+1                ! line must end with a blank

        if ( fc == 0 ) then
          if ( line(k:k) /= blank ) fc = k      ! have start of toktn

        else if ( line(k:k) == blank ) then     ! have end of token
          nt = nt+1
          list_of_tokens(nt) = line(fc:k)       ! add token to list
          read (list_of_tokens(nt), fmt='(f20.0)', iostat=iostat ) val
          if ( iostat == 0 ) then
            nr = nr+1
            list_of_reals(nr) = val             ! add reaL to list
            write ( *,11) nt, trim(list_of_tokens(nt)), '   val =',val
            write ( *,11) nt, trim(list_of_tokens(nt))
          end if
          fc = 0    ! set for start of next token
        end if
      end do

      write ( *,10) 'Summary'
      write ( *,11) nt,'tokens identified'
      write ( *,11) nr,'real values identified'

 10   format (/a,a)
 11   format ( i4,2x,a,a,g10.4 )
1 Like

My 2c:

  • the length in f descriptor in
    read (list_of_tokens(nt), fmt='(f20.0)', iostat=iostat ) val
    should match the length of list_of_tokens(nt) (now set to 32), otherwise a token like 1234567890123456789012.34 will get truncated not only in terms of significant digits but also in the order of magnitude:
    4 1234567890123456789012.843 val =0.1235E+20 (should be 0.1235E+22)
    this could be fixed by using dynamic format
character ::  valfmt*10
! [...]
write(valfmt,"('(F',I0,'.0)')"), len(list_of_tokens(1))
! [...]
read (list_of_tokens(nt), fmt=valfmt, iostat=iostat ) val
  • based on my experience with gfortran (see also here) I would add semicolon to the list of delimiters, possibly replacing space which is redundant there.
  • @JohnCampbell’s code would be better readable if started with ```fortran. As it is now, the language has not been recognized properly by the server

@msz59 Thanks for your comments.

I do agree that there is a problem if real number strings longer than 20 digits are provided but I am not convinced by the solution. A Warning probably should be provided (we are using a 64-bit real)

You could add ; or : or / (date/time) or ~ to the list, but these can be specific delimiter cases.
In the code example, the first delimiter is blank, which replaces all other delimiters supported. This implies repeated delimiters are merged to a single delimiter. Use of multiple ;;; might not be interpreted this way.

For the data formats I use, comma and HT are interpreted differently as when repeated they do indicate significant null token fields. My simplified code is ambiguous for these special types.

Should ```fortran be manually applied after selecting the </> icon ?
Thanks for the suggestion.

I don’t get this. Why not use the same length for the token string and format? F20.0 is not enough for a 64-bit real anyway. The default list-formatted output of such a value is 1.2345678901234568E+021 (23 characters)

This would make sense if you wrote

if ( index (delimiters, line(k:k)) > 1 ) line(k:k) = delimiters(1:1)

As it is now, it just blindly replaces any delimiter (including space) by a space. Semicolon is a legal fields separator in case when DECIMAL='COMMA' is activated. F2018 standard says:

13.6.2 If the decimal edit mode is COMMA during list-directed input/output, the character used as a value separator is a semicolon in place of a comma.

gfortran recognizes a semicolon as a separator even if DECIMAL has the default value of POINT (by an extension or error :slight_smile: )

I never use that icon, typing ``` by hand but as I see now, clicking </> produces a pair of
triple back-quote sequences, so you would have to add fortran to the first of those. See the short discussion here. I suggested making fortran the default language on this particular discourse but maybe it is not (easily?) doable.

I’m old and old-fashioned but I think that this stuff is tedious and dumb

implicit none
real :: x, y
character(len=10) :: char1

If you know what column this is in, use T format and skip the stuff above

msz59 will be glad to know I reported the gfortran bug with semicolon separators a few weeks ago but I don’t think the bug-fix has yet been incorporated into the compiler. Also even 23 spaces for a real number was nowhere near enough recently when I wanted to see how two different algorithms compared when working in quad precision.