FYI: To keep the example simpler, I removed a DO loop that stepped through the line and replaced all the obviously non-numeric characters with an X, which eliminates the issues with `"`, `'`, `*` and `/`.
Here is an attempt that identifies the tokens without using “read(line,*) token”. It uses two simple DO-loop scans of the line string.
The first scan replaces all other delimiters with a blank. The list of possible delimiters can always be changed!
The second scan finds the start and end of each token and tests whether a Fortran READ accepts the token as a real number.
This approach does not include some useful free format extensions, such as:
- the use of , , to identify null tokens
- the use of arithmetic operators + - / * ^ for simple numeric calculations.
- coping with ascii HT char(9)
```
character :: line*128
character :: blank = ' '
character :: delimiters*5 = ' ,()='
character :: list_of_tokens(64)*32
real      :: list_of_reals(64), val
integer   :: nt, nr, fc, k, iostat

line = ' ( v= 1) w = 2.843 unit1 = 5.867 unit2 4.567 1.232 )'
write ( *,10) 'Initial line :', trim(line)

! replace other delimiters with blank
do k = 1, len_trim(line)
  if ( index (delimiters, line(k:k)) > 0 ) line(k:k) = blank
end do
write ( *,10) 'Without delim:', trim(line)

! find all tokens in filtered line
nt = 0   ! number of tokens
nr = 0   ! number of reals
fc = 0   ! first character of next token
write ( *,10) 'Search for Tokens'
do k = 1, len_trim(line)+1   ! line must end with a blank
  if ( fc == 0 ) then
    if ( line(k:k) /= blank ) fc = k    ! have start of token
  else if ( line(k:k) == blank ) then   ! have end of token
    nt = nt+1
    list_of_tokens(nt) = line(fc:k)     ! add token to list
    read (list_of_tokens(nt), fmt='(f20.0)', iostat=iostat ) val
    if ( iostat == 0 ) then
      nr = nr+1
      list_of_reals(nr) = val           ! add real to list
      write ( *,11) nt, trim(list_of_tokens(nt)), ' val =',val
    else
      write ( *,11) nt, trim(list_of_tokens(nt))
    end if
    fc = 0   ! set for start of next token
  end if
end do

write ( *,10) 'Summary'
write ( *,11) nt,'tokens identified'
write ( *,11) nr,'real values identified'
10 format (/a,a)
11 format ( i4,2x,a,a,g10.4 )
end
```
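On the last limitation in the list (the ASCII HT character), the simplest treatment is probably to add `achar(9)` to the delimiter string so the first scan turns tabs into blanks as well. The following is only a sketch of that idea, not part of the code above; the program name and test line are made up for illustration, and it merges repeated tabs, which is not what you want if repeated tabs mark null fields.

```fortran
program tab_delimiter_demo
  implicit none
  ! Same delimiter list as above, with HT (achar(9)) appended (an assumption).
  character(len=*), parameter :: delimiters = ' ,()=' // achar(9)
  character(len=32) :: line
  integer :: k

  ! Illustrative input containing a tab between "w" and "=".
  line = 'w' // achar(9) // '= 2.843'

  ! First scan: replace every delimiter, now including HT, with a blank.
  do k = 1, len_trim(line)
    if (index(delimiters, line(k:k)) > 0) line(k:k) = ' '
  end do

  write (*, '(a,a,a)') '[', trim(line), ']'

  ! Self-check: no tab may survive the scan.
  if (index(line, achar(9)) /= 0) error stop 'tab was not replaced'
end program tab_delimiter_demo
```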
- the length in `read (list_of_tokens(nt), fmt='(f20.0)', iostat=iostat ) val` should match the length of `list_of_tokens(nt)` (now set to 32), otherwise a token like `1234567890123456789012.34` will get truncated not only in terms of significant digits but also in the order of magnitude: `4 1234567890123456789012.843 val =0.1235E+20` (should be 0.1235E+22). This could be fixed by using a dynamic format:

  ```fortran
  character :: valfmt*10
  ! [...]
  write(valfmt,"('(F',I0,'.0)')") len(list_of_tokens(1))
  ! [...]
  read (list_of_tokens(nt), fmt=valfmt, iostat=iostat ) val
  ```
- based on my experience with `gfortran` (see also here), I would add semicolon to the list of delimiters, possibly replacing space, which is redundant there.
- @JohnCampbell’s code would be more readable if its block started with ```fortran. As it is now, the language has not been recognized properly by the server.
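For anyone who wants to check the first point in isolation, here is a minimal self-contained sketch. It assumes the same 32-character token length as `list_of_tokens`; the program name and the variables `val_fixed` and `val_dyn` are only illustrative.

```fortran
program dynamic_format_demo
  implicit none
  character(len=32) :: token      ! same length as the list_of_tokens elements
  character(len=10) :: valfmt
  real    :: val_fixed, val_dyn
  integer :: ios

  token = '1234567890123456789012.34'

  ! Fixed width: only the first 20 characters of the token are read.
  read (token, fmt='(f20.0)', iostat=ios) val_fixed
  if (ios /= 0) error stop 'fixed-width read failed'

  ! Dynamic width: build "(F32.0)" from the declared token length.
  write (valfmt, "('(F',I0,'.0)')") len(token)
  read (token, fmt=valfmt, iostat=ios) val_dyn
  if (ios /= 0) error stop 'dynamic-width read failed'

  write (*, '(a,a)')      'format used : ', trim(valfmt)
  write (*, '(a,es12.4)') 'f20.0 gives : ', val_fixed
  write (*, '(a,es12.4)') 'dynamic     : ', val_dyn

  ! The two results should differ by about a factor of 100.
  if (val_dyn < 50.0 * val_fixed) error stop 'expected about a factor of 100'
end program dynamic_format_demo
```

The F20.0 read stops after 20 digits, so the value comes out two orders of magnitude too small, while the format built from `len(token)` covers the whole field.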
@msz59 Thanks for your comments.
I do agree that there is a problem if real number strings longer than 20 digits are provided, but I am not convinced by the solution. A warning should probably be issued (we are using a 64-bit real).
You could add ; or : or / (date/time) or ~ to the list, but these can be specific delimiter cases.
In the code example, the first delimiter is blank, which replaces all other delimiters supported. This implies repeated delimiters are merged to a single delimiter. Use of multiple ;;; might not be interpreted this way.
For the data formats I use, comma and HT are interpreted differently: when repeated, they indicate significant null token fields. My simplified code is ambiguous for these special cases.
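To illustrate the difference, here is a rough sketch of a scan that splits on commas only and keeps empty fields instead of merging repeated delimiters. It is not the code I actually use; the test line, the `<null>` marker and the counter `n_null` are made up for the example.

```fortran
program null_fields_demo
  implicit none
  ! Illustrative input: the second and the last field are null.
  character(len=*), parameter :: line = '1.5,,2.5,abc,'
  integer :: k, fc, nt, n_null

  nt = 0       ! number of fields, including null ones
  n_null = 0   ! number of null fields
  fc = 1       ! first character of the current field
  do k = 1, len(line) + 1
    ! A field ends at each comma and at the end of the line.
    if (k <= len(line)) then
      if (line(k:k) /= ',') cycle
    end if
    nt = nt + 1
    if (len_trim(line(fc:k-1)) == 0) then
      ! Empty or blank-only field: keep it as a null token.
      n_null = n_null + 1
      write (*, '(i4,2x,a)') nt, '<null>'
    else
      write (*, '(i4,2x,a)') nt, trim(adjustl(line(fc:k-1)))
    end if
    fc = k + 1
  end do

  ! Self-check for the illustrative line: 5 fields, 2 of them null.
  if (nt /= 5 .or. n_null /= 2) error stop 'unexpected field count'
end program null_fields_demo
```

Unlike the blank-merging scan, every comma here closes a field, so `,,` produces one null token rather than a single delimiter.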
Should ```fortran be applied manually after selecting the </> icon?
Thanks for the suggestion.
I don’t get this. Why not use the same length for the token string and format? F20.0 is not enough for a 64-bit real anyway. The default list-formatted output of such a value is `1.2345678901234568E+021` (23 characters).
This would make sense if you wrote `if ( index (delimiters, line(k:k)) > 1 ) line(k:k) = delimiters(1:1)`.
As it is now, it just blindly replaces any delimiter (including space) by a space. A semicolon is a legal field separator when `DECIMAL='COMMA'` is in effect. The F2018 standard says:
13.6.2 If the decimal edit mode is COMMA during list-directed input/output, the character used as a value separator is a semicolon in place of a comma.
gfortran recognizes a semicolon as a separator even if `DECIMAL` has the default value of `POINT` (by an extension or an error).
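A small sketch of the standard-conforming case, where the semicolon separator is legal because the decimal edit mode is set to COMMA for that one READ. The program name and the test line are just for illustration.

```fortran
program decimal_comma_demo
  implicit none
  character(len=32) :: line
  real    :: a, b
  integer :: ios

  ! Illustrative input: decimal commas, values separated by a semicolon.
  line = '1,5; 2,25'

  ! DECIMAL='COMMA' applies to this list-directed READ only.
  read (line, *, decimal='comma', iostat=ios) a, b
  if (ios /= 0) error stop 'list-directed read failed'

  write (*, '(2f8.3)') a, b

  ! Self-check: the two values should be 1.5 and 2.25.
  if (abs(a - 1.5) > 1.0e-6 .or. abs(b - 2.25) > 1.0e-6) error stop 'unexpected values'
end program decimal_comma_demo
```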
I never use that icon, typing ``` by hand, but as I see now, clicking </> produces a pair of triple back-quote sequences, so you would have to add `fortran` to the first of those. See the short discussion here. I suggested making `fortran` the default language on this particular Discourse, but maybe it is not (easily?) doable.
I’m old and old-fashioned, but I think that this stuff is tedious and dumb.

```fortran
implicit none
real :: x, y
character(len=10) :: char1
```

If you know what column this is in, use T format and skip the stuff above.
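To make the T-format suggestion concrete, here is a minimal sketch. It assumes a hypothetical fixed layout (labels in columns 1-8 and 17-24, values in columns 9-16 and 25-32), which is not taken from the data discussed above.

```fortran
program t_format_demo
  implicit none
  real :: x, y
  character(len=32) :: line

  ! Hypothetical fixed layout: labels in columns 1-8 and 17-24,
  ! values in columns 9-16 and 25-32.
  line = 'w =     ' // '   2.843' // 'unit1 = ' // '   5.867'

  ! T9 and T25 jump straight to the value columns, skipping the labels.
  read (line, '(t9, f8.0, t25, f8.0)') x, y

  write (*, '(2f8.3)') x, y

  ! Self-check against the values placed in the line above.
  if (abs(x - 2.843) > 1.0e-5 .or. abs(y - 5.867) > 1.0e-5) error stop 'unexpected values'
end program t_format_demo
```

This only works when the column positions are known in advance; for free-format input the token scan is still needed.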
msz59 will be glad to know I reported the gfortran bug with semicolon separators a few weeks ago but I don’t think the bug-fix has yet been incorporated into the compiler. Also even 23 spaces for a real number was nowhere near enough recently when I wanted to see how two different algorithms compared when working in quad precision.