Rewind of input_unit

It is sometimes useful to read some or all of the input to a program twice. What is specified in the fortran standard about

   rewind input_unit

where input_unit is preconnected to stdin. I have found that it works as expected in most cases, but I’m curious if there are limits on the file size or other circumstances where this might not be true. I’m particularly interested in the situations where stdin is an input pipe, or redirection from a unix shell, or a here document in a shell.

program <input

first | program

program <<EOF
input
records
EOF

Are there POSIX specifications that affect when this does and doesn’t work?

1 Like

A quick-and-dirty search led me to this post: Is there a way to rewind stdin in C? - Stack Overflow. I have not looked for any formal standard, but my intuition says that you simply cannot rely on such a feature.

1 Like

I know of no specification that says that standard-input must be buffered, let alone that the buffer be accessible to consuming programs or have a minimum size. If anyone knows of one I’d be interested to hear about it. Thus, once read, I don’t think there’s any guarantee that the data of the input stream continues to exist anywhere.

1 Like

Does your comment apply to all three of the cases I asked about, or to just the pipe and here document forms? Does standard input in the case of redirection (my first example) give stdin all the characteristics of the connected file itself?

As a practical matter, this is what gfortran does for the three cases for a small test file:

program stdin
   use, intrinsic :: iso_fortran_env, only: input_unit, output_unit
   implicit none
   integer :: i, j
   character(*), parameter :: cfmt = '(*(g0))'
   do i = 1, 5
      read(input_unit,*) j
      write(output_unit,cfmt) 'line ', i, ' read. j= ', j
   enddo
   rewind input_unit
   write(output_unit,cfmt) 'after rewind'
   do i = 1, 5
      read(input_unit,*) j
      write(output_unit,cfmt) 'line ', i, ' read. j= ', j
   enddo
end program stdin

Here is an input file:

cat >input.dat <<EOF
1
2
3
4
5
EOF

Redirection:

$ a.out <input.dat
line 1 read. j= 1
line 2 read. j= 2
line 3 read. j= 3
line 4 read. j= 4
line 5 read. j= 5
after rewind
line 1 read. j= 1
line 2 read. j= 2
line 3 read. j= 3
line 4 read. j= 4
line 5 read. j= 5

Pipe:

$ cat input.dat | a.out
line 1 read. j= 1
line 2 read. j= 2
line 3 read. j= 3
line 4 read. j= 4
line 5 read. j= 5
At line 10 of file stdin.f90 (unit = 5, file = 'stdin')
Fortran runtime error: Illegal seek

Error termination. Backtrace:
#0  0x1049f380e
#1  0x1049f44b5
#2  0x1049f50cb
#3  0x1049e9c3b
#4  0x1049e9e5c

Here document:

$ a.out <<EOF
1
2
3
4
5
EOF
line 1 read. j= 1
line 2 read. j= 2
line 3 read. j= 3
line 4 read. j= 4
line 5 read. j= 5
after rewind
line 1 read. j= 1
line 2 read. j= 2
line 3 read. j= 3
line 4 read. j= 4
line 5 read. j= 5

Maybe this is a question for the compiler writers. What exactly is the difference in these three cases regarding the fortran i/o library? How much of this behavior is determined/required by the fortran and POSIX standards? How much is just a quality of implementation issue for the various compilers?

1 Like

You are entirely at the mercy of the underlying operating system. Some kinds of files can be rewound, some can’t. None of this is specified by the Fortran standard. 12.3.1 says:

At any given time, there is a processor-dependent set of allowed access methods, a processor-dependent set of allowed forms, a processor-dependent set of allowed actions, and a processor-dependent set of allowed record lengths for a file.
NOTE 1
For example, the processor-dependent set of allowed actions for a printer would likely include the write action, but not the read action.

If re-reading the input is desired, I suggest saving it in an array or a scratch file that can be rewound.

1 Like

Yes, those are possible options. However, if it is not necessary, then it is much simpler to just rewind and reread the input. As you know, it is not straightforward to make an exact byte-for-byte copy of a file in fortran, there are all kinds of record length and significant blank issues that arise.

I am probably most interested in the redirection case, program < input. In this case the fortran input_file is connected to the external file input. If that file were opened in the usual way, say with action='read', then it would be allowed to rewind it and read it again. Does the same thing happen when that file is connected through the preconnected input_file?

FYI, the ifort compiler does the same thing as gfortran for the above program. Namely, redirection and the here document work, but the pipe fails. I’m trying to understand how reliable and portable that behavior is.

1 Like

Preconnection isn’t anything special. What counts is what’s being (pre)connected to. If it’s a normal file, then you should typically expect to be able to rewind it. A pipe (or console), probably not.

2 Likes

Making an exact copy of the input can be done with the program below:

! read_and_copy.f90 --
!     Try and copy a redirected file
!
program read_and_copy
    use iso_fortran_env

    integer          :: ierr
    character(len=1) :: byte

    open( 10, file = 'copy.out' )

    do
        read( input_unit, '(a)', advance = 'no', iostat = ierr ) byte

        select case ( ierr )
            case ( 0 )
                 write( 10, '(a)', advance = 'no' ) byte
            case ( iostat_eor )
                write( 10, '()' )
            case ( iostat_end )
                 exit
        end select

    enddo
end program read_and_copy

I tested it with both gfortran and Intel Fortran oneAPI on Windows. (Note the end-of-record case: Intel behaved differently than gfortran when I had a “*” instead of the empty format)

2 Likes

Ok, here is another quirky thing about fortran i/o. Instead of the program just stopping after copying the file, suppose you want instead to leave the input file positioned at its end, so that it could be appended to later. Is the above the correct way to terminate, or should it be

   case ( iostat_end )
      backspace
      exit

The question is, right after the iostat_end is generated, is the file left positioned before or after the (virtual) ENDFILE record? Does advance='no' affect the answer to that question?

1 Like

I am more or less convinced (based on anecdotal evidence only) that the OS may interfer, so that is not purely a Fortran issue. But a practical solution might be to use BACKSPACE and keep your fingers crossed.

The Fortran standard is very specific about this. When an end-of-file condition is detected, “if the file specified in the input statement is an external record file, it is positioned after the endfile record”. (12.11.3). BACKSPACE is then required (12.3.4.3p3) to position to before the endfile record so that writes can be done.

ADVANCE='NO' has no effect on this.

That’s… interesting. With list-directed, Intel adds a blank for records that would otherwise be empty. The standard does say, for list-directed output, “each output record begins with a blank character”, but why it should do this only for empty records I don’t understand. I’m hesitant to call it a bug, however - list-directed leaves a lot up to the processor. FWIW, NAG Fortran behaves here the same as gfortran.

This part of the discussion has stayed away from the original input_unit question.

Thanks for the clarification. I have always thought this was clear too, but in the past I have seen inconsistent behavior. Consider this program:

program backspace_test
   use iso_fortran_env, only: output_unit, iostat_end
   implicit none
   integer :: n, istat
   character(80) :: line
   character(*), parameter :: cfmt='(a)'
   open(newunit=n, file='input.txt')
   do     ! skip to the end of the file.
      read(n,cfmt,iostat=istat) line
      if ( istat == iostat_end ) exit
      write(output_unit,cfmt) line
   enddo
   backspace n  ! this should backspace to before the endfile record.
   backspace n  ! this should backspace to before the last record.
   read(n,cfmt) line
   write(output_unit,cfmt) 'after 2 backspaces, the line is:', line
end program backspace_test

with this file

$ cat >input.txt <<EOF
first line
second line
last line
EOF

I just tested recent versions of gfortran and ifort, and they both write the last line of the file at the end, as they should. But I definitely have used compilers in the past that would write out the second line instead.

With recent fortran revisions, this possible incompatibility can be bypassed. I think the statement

   open(unit=n, position='append')

can replace the whole do loop if all it does is search for the endfile, and will leave the file positioned correctly.

As for the list directed write, I agree that it is pretty much up to the compiler whether or not to write the space. In fact, a compiler could probably write extra spaces if it wanted, and maybe even extra blank lines, and still conform to the standard. I pretty much avoid using list directed writes for this reason for anything nontrivial.

I’ve been doing some more experimentation with @Arjen 's program and have convinced myself that ifort is correct to write a blank in the cases where the list-directed write starts the record. I’m going to ask Malcolm why nagfor doesn’t.

If I remember correctly, I have used that in the past to produce an empty line. I did not ask myself whether that was the correct behaviour, I immediately admit. I only stumbled upon the difference between the two compilers, when I wanted an exact copy of the input.

Malcolm seems to believe that nagfor’s (and gfortran’s) behavior is acceptable and referenced trailing blanks in his response. Both compilers DO insert a leading blank if the record is not zero-length - I find this bizarre, but am not going to pursue it further, other than to note that the standard requires that no leading blank be output when a character value is continued - itself a rather inconsistent requirement.