Inconsistent EOR and EOF conditions with non-advancing I/O

Does anybody know, whether end-of-record or end-of-line is the right one to report by the iostat value of a read() command, when reading the last line of a file, which ends without trailing newline using non-advancing I/O. I have found different behavior with different compilers.

  • Compiler 1: First read returns the line and reports end-of-record, next read returns nothing and reports end-of-file.

  • Compiler 2: First read returns the line and reports end-of-line immediately.

The following snippet demonstrates the issue

Minimal working example
program testnaio
  implicit none

  integer :: unit, iostat
  character(:), allocatable :: line

  open(newunit=unit, file="test.txt")
  do
    call get_line_nonadvanced(unit, line, iostat)
    print "(3a, i0)", "Read |", trim(line), "| ", iostat
    if (is_iostat_end(iostat)) exit
  end do
  close(unit)

contains

  subroutine get_line_nonadvanced(unit, line, iostat)
    integer, intent(in) :: unit
    character(:), allocatable, intent(out) :: line
    integer, intent(out) :: iostat

    character(10) :: buffer
    integer :: nread

    line = ""
    do
      read(unit, "(a)", advance="no", iostat=iostat, size=nread) buffer
      if (iostat > 0) return
      line = line // buffer(1:nread)
      if (iostat < 0) return
    end do

  end subroutine get_line_nonadvanced

end program testnaio

Reading the attached text file (consisting of a single line without newline), I obtain different outputs:

  • Compiler 1
    Read |Some content (without newline)| -2
    Read || -1
    
  • Compiler 2
    Read |Some content (without newline)| -1
    

(The text between the |'s is the content read, the numbers indicate the returned iostat values.

Are both behaviors standard conforming?

test.txt (30 Bytes)

1 Like

Why is the program not standard conforming? Can you ellaborate more on it?

Oh, I see, thanks for the clarification! However, the standard seems to explicitely distinguish between “error condition”, “end-of-file” condition and “end-of-record” condtion. Especially

12.11.5 IOSTAT= specifier

Execution of an input/output statement containing the IOSTAT= specifier causes the stat-variable in the IO-
STAT= specifier to become defined with

• a zero value if neither an error condition, an end-of-file condition, nor an end-of-record condition occurs,
[…]
• a processor-dependent positive integer value different from IOSTAT_INQUIRE_INTERNAL_UNIT if any
other error condition occurs,

• the processor-dependent negative integer value of the constant IOSTAT_END (16.10.2.16) from the intrinsic
module ISO_FORTRAN_ENV if an end-of-file condition occurs and no error condition occurs,

• the processor-dependent negative integer value of the constant IOSTAT_EOR (16.10.2.17) from the intrinsic
module ISO_FORTRAN_ENV if an end-of-record condition occurs and no error condition or end-of-file
condition occurs, or […]

If I interpret the this correctly, negative values signalize, that no error occurred, so that buffer(1:nread) should be well defined in those cases.

The question is rather, whether

  • in the case of non-advancing I/O

  • when reading the last (non-empty) line of a file, which does not end with newline

one can expect to obtain

  • an end-of-record iostat-value first (with nread > 0)

  • and then an end-of-file at the next invocation of read() (with nread = 0)

or is it also standard conforming, if one obtains

  • an end-of-file iostat-value already at the first invokation of read() (with nread > 0).

After re-reading 12.11 again, 12.11.1/2 seems to be relevant:

An end-of-record condition occurs when a nonadvancing input statement attempts to transfer data from a position
beyond the end of the current record, unless the file is a stream file and the current record is at the end of the
file (an end-of-file condition occurs instead).

Does this mean, that in cases like above (where sequential access is used), the standard conforming behavior would be to have an end-of-record first and only then end-of-file?

One catch is that end-of-record and end-of-file conditions are specified by the Fortran standard, but error conditions are processor-dependent. In particular if EOR or EOF has already happened then different compilers may differ on what to do with the next attempted input from the same unit. There has been a long discussion in comp.lang.fortran recently involving this point.

I agree, that once end-of-file occured, it is not standard conforming to read from the file once more. However, in the case of end-of-record (when using non-advanced I/O), it should be possible. Section 12.11.4 says

If an end-of-record condition occurs during execution of an input/output statement that contains either an EOR= specifier or an IOSTAT= specifier, and an error condition does not occur then:
[…]
(4) the file specified in the input statement is positioned after the current record;

To me it sounds, as it would be OK to continue reading, starting with the next record.

However, as for the original question, I have found yet another section which may be relevant:

6 During nonadvancing input when the pad mode has the value NO, an end-of-record condition (12.11) occurs if the input list and format specification require more characters from the record than the record contains, and the record is complete (12.3.3.4). If the record is incomplete, an end-of-file condition occurs instead of an end-of-record condition.

7 During nonadvancing input when the pad mode has the value YES, blank characters are supplied by the processor if an effective item and its corresponding data edit descriptors require more characters from the record than the record contains. If the record is incomplete, an end-of-file condition occurs; otherwise, an end-of-record condition occurs.

If I interpret this correctly, when the file ends with an incomplete record (in my case a line without newline at the end), one should directly obtain an end-of-file, and not first an end-of-record and then an end-of-file.

I think, this is not the case. I’ve just found, that if PAD is not specified in the open() command, it defaults to YES. And according to 12.11.4 (on end-of-record condition)

if the pad mode has the value
(a) YES, the record is padded with blanks to satisfy the effective item (12.6.4.5.3) and correspond-
ing data edit descriptors that require more characters than the record contains,
(b) NO, the input list item becomes undefined;

So the content of buffer(1:nread) would be well defined, wouldn’t it?

If nread is a positive number then the concatenation line//buffer(1:nread) is longer than line and you lose the contents of buffer in the assignment, which I suspect isn’t what you want. Whether its assignment to line is undefined requires more of a search of the standard than I am prepared to do.

Please note, that line is allocatable, so it will be automatically reallocated to the correct length to contain both, its previous content, as well as buffer(1:nread). So, nothing gets lost IMO.