Today I learned about an odd feature of list-directed input.
Consider the following file (“strings.txt”):
1*5 aa aaa
2*1 b bb
3*2 ccc c
Let’s say I am only interested in the second and third columns. My usual approach is to have a dummy character variable, used as follows:
implicit none
character :: dummy
character(8) :: column1, column2
integer :: i

open (100, file='strings.txt')
do i = 1, 3
   read (100, *) dummy, column1, column2
   print *, column1, column2
end do
close (100)
end
My hope is for the following output:
aa aaa
b bb
ccc c
But this is not what the program produces. Instead, I get:
aa aaa
1 b
2 2
How bizarre!
It turns out that the asterisk (like some other characters) has a special meaning in this context. From this Intel documentation page I gather that the asterisk signifies a repeat count. I don’t fully understand how it is meant to be used, and the result was quite confusing! I wonder how widely known or used this feature of list-directed input is.
In my case, this behavior was not desired. As a workaround, I read each line into a long character variable and replaced each asterisk with a more benign character. Then I could do an internal list-directed read on the sanitized line.
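For anyone who wants to try it, the workaround could look roughly like the sketch below. The three sample lines are embedded in the program so that it stands alone; in practice each line would come from a read (unit, '(A)') on the file. The names lines and line, and the choice of an underscore as the replacement character, are assumptions made for illustration and are not taken from the original code.

```fortran
program sanitize_then_read
   implicit none
   ! Sample records from strings.txt, embedded so the sketch is self-contained.
   character(16), parameter :: lines(3) = &
      [character(16) :: '1*5 aa aaa', '2*1 b bb', '3*2 ccc c']
   character(16) :: line
   character(8)  :: dummy, column1, column2
   integer :: i, k, ios

   do i = 1, size(lines)
      line = lines(i)
      ! Replace every asterisk so that "2*1" is no longer parsed as a
      ! repeat count. The underscore is an arbitrary, assumed replacement.
      do k = 1, len(line)
         if (line(k:k) == '*') line(k:k) = '_'
      end do
      ! Internal list-directed read on the sanitized line.
      read (line, *, iostat=ios) dummy, column1, column2
      if (ios /= 0) then
         print *, 'could not parse: ', trim(line)
         cycle
      end if
      print *, trim(column1), ' ', trim(column2)
   end do
end program sanitize_then_read
```

Note that this only neutralizes the asterisk. Commas, slashes and quote characters in the data would still be interpreted by the list-directed read and would need similar treatment.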
Anyway, I just wanted to share this little adventure, and I am curious if anyone else has ever been surprised by this behavior or actually found use in it.
I agree it’s surprising, but it is consistent with the first two examples of the overloaded asterisk in the code below:
program test23stars
character:: b,c,d='?'
integer j,k,n(2)
data j,k/2*0/ ! repeated value in data (3)
call input(n,'2*4',b,c,d,*666)! repeated value in list-directed input,
C ! alternate return
...
Repeat counts apply to both list-directed I/O and namelist I/O. They were originally a convenience for people who read input from punched cards. I think the processor is also allowed to use repeat counts on output, so the feature is self-consistent. On input, something like 5*, skips over five entries, the same as five consecutive commas.
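Both forms can be tried with an internal read. The sketch below is only an illustration; the variable names are made up, and the first case reuses the second line of strings.txt from the original post.

```fortran
program repeat_count_demo
   implicit none
   character(32) :: text
   character(8)  :: a, b, c
   integer       :: n(5)

   ! r*c form: "2*1" supplies the value 1 twice, so the three targets
   ! consume '1', '1' and 'b'; the trailing 'bb' is never read.
   text = '2*1 b bb'
   read (text, *) a, b, c
   print *, trim(a), ' ', trim(b), ' ', trim(c)

   ! r* form: "2*" stands for two null values, which should leave the
   ! corresponding targets (here n(4) and n(5)) at their previous value.
   n = -1
   text = '3*7, 2*'
   read (text, *) n
   print *, n
end program repeat_count_demo
```

The first read reproduces the "1 b" line from the original post: column1 receives the second copy of 1 and column2 receives b.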
I worked on a project that made substantial use of this feature for their custom format input files.
Edit to say: I don’t think it was a good idea, and people shouldn’t do this anymore. We have fairly standardized file formats for these things nowadays (e.g. JSON or YAML).
One of the things I remember from working on the help desk at Imperial College was the complexity of Fortran I/O and how easy it was for users to get things wrong. Lots of gotchas.
There are more odd behaviors of list-directed input, including the treatment of undelimited character strings, null values, and / as a terminator. While list-directed input is sometimes very handy if you understand everything it does, it often trips people up.
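A small sketch of those three behaviors, using internal reads so that it runs on its own. The variable names and input strings are invented for the example.

```fortran
program list_directed_gotchas
   implicit none
   character(32) :: text
   character(16) :: word
   integer       :: v(4)

   ! Undelimited character string: the blank ends the value,
   ! so only 'hello' is stored in word.
   text = 'hello world'
   read (text, *) word
   print *, trim(word)

   ! Null value (two consecutive commas) and slash terminator:
   ! v(2) is skipped, and everything after '/' is ignored, so
   ! v(2) and v(4) should keep their initial value of -1.
   v = -1
   text = '1,,3/4'
   read (text, *) v
   print *, v
end program list_directed_gotchas
```

None of these cases raises an error, which is exactly why they are easy to miss.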
It gets worse if the implementation is more liberal. For many years, DEC Fortran allowed free conversion between logical and numeric, so that a bare T in the input stream would convert to -1. Fortunately, that is no longer the default in Intel Fortran.
The biggest mistake I see people make is to delegate error checking to list-directed input. I keep telling people that it is far more accepting than you probably want.
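One possible way to be stricter, sketched here under my own assumptions, is to validate each token yourself and then convert it with an explicit format and iostat. The helper logic and the sample tokens are hypothetical; real input would need whatever rules fit its format.

```fortran
program strict_integer_field
   implicit none
   character(16) :: tokens(3)
   integer :: i, value, ios

   ! Hypothetical sample tokens: one plain integer and two that
   ! list-directed input would quietly reinterpret.
   tokens = [character(16) :: '42', '2*1', '7/']

   do i = 1, size(tokens)
      ! Accept only non-empty tokens made of sign characters and digits.
      if (len_trim(tokens(i)) == 0 .or. &
          verify(trim(tokens(i)), '+-0123456789') /= 0) then
         print *, 'rejected: ', trim(tokens(i))
         cycle
      end if
      ! An explicit I edit descriptor plus iostat catches what the
      ! character check lets through, such as a misplaced sign.
      read (tokens(i), '(I16)', iostat=ios) value
      if (ios /= 0) then
         print *, 'rejected: ', trim(tokens(i))
      else
         print *, 'accepted: ', value
      end if
   end do
end program strict_integer_field
```

The point is only that the program, not the list-directed reader, decides what counts as valid input.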
And then there are all the complaints about list-directed output not formatting things the way people want.
Another situation where list-directed output does not suffice: a complex program that is being developed with a set of test problems, and the output files are to be generated with different compilers or different compiler options, for comparison with a reference output file set.
File comparison utilities rarely think that 3.14159274 and 3.14159 are the same (exceptions: ndiff and numdiff on Linux).
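For output that has to be compared across compilers, explicit edit descriptors at least pin down the field width and the number of digits. This is a minimal sketch with an arbitrary choice of formats; differences in the last printed digit can still occur, so it reduces rather than removes the need for a tolerant comparison tool.

```fortran
program fixed_output_format
   implicit none
   real :: pi_single

   pi_single = 4.0 * atan(1.0)

   ! List-directed: width and number of digits are chosen by the compiler.
   print *, pi_single

   ! Explicit formats: layout is the same with every compiler.
   print '(A, F10.5)',  'pi = ', pi_single
   print '(A, ES14.6)', 'pi = ', pi_single
end program fixed_output_format
```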