Clarification on expected behavior of ENDFILE

I am not a frequent user of the ENDFILE statement. Working with a code that uses it, I am seeing
different behavior from different compilers and am looking for clarification on what the standard
behavior is. Specifically, the standard says:

Execution of an ENDFILE statement for a file connected for
sequential access writes an endfile record as the next record of
the file. The file is then positioned after the endfile record, which
becomes the last record of the file.

I am trying to interpret the intended meaning of “next record”, as I
get different results when truncating a sequential formatted file with
different compilers. Should the file generated by the following example have five or six lines in it?

$ ifort xx.f90 && ./a.out
 rewind and read           5 lines
           1         101
           2         102
           3         103
           4         104
           5         105
 number of lines in file was           10 , is now            5
$ gfortran xx.f90 && ./a.out

 rewind and read           5 lines
           1         101
           2         102
           3         103
           4         104
           5         105
           6         106
 number of lines in file was           10 , is now            6

program demo_endfile
implicit none
integer :: lun, i, j, iostat
integer,parameter:: isz=10
   !
   ! create a little scratch file
   open(newunit=lun,file='_scr.txt', form='formatted')
   write(lun,'(i0)')(100+i,i=1,isz)
   !
   ! write end of file after reading half of file
   rewind(lun)
   write(*,*)'rewind and read',isz/2,'lines'
   read(lun,*)(j,i=1,isz/2)
   endfile lun ! truncates the file at the current position
   !
   ! NOTE: backspace before writing any additional lines
   !       once an ENDFILE(7f) statement is executed
   ! backspace(lun)
   !
   ! rewind and echo remaining file
   rewind(lun)
   j=0
   do i=1,huge(0)-1
      read(lun,*,iostat=iostat)j
      if(iostat.ne.0)exit
      write(*,*)i,j
   enddo
   write(*,*)'number of lines in file was ',isz,', is now ',i-1
   close(unit=lun,status='delete')
end program demo_endfile

The Fortran 95 Standard says, in Note 9.2:

“An endfile record does not necessarily have any physical embodiment. The processor may use a record count or other means to register the position of the file at the time an ENDFILE statement is executed, so that it can take appropriate action when that position is reached again during a read operation. The endfile record, however it is implemented, is considered to exist for the BACKSPACE statement (9.5.1).”
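The last sentence of the note is what the commented-out BACKSPACE in the example program relies on. As a minimal sketch (the program name `endfile_then_append` and the record contents are made up for illustration), the following writes three records, executes ENDFILE, backspaces over the endfile record, and then appends one more record. On a conforming processor the final count should be four records.

```fortran
program endfile_then_append
   implicit none
   integer :: lun, i, n, ios
   character(len=16) :: line

   ! Scratch file, so nothing is left behind on disk.
   open(newunit=lun, status='scratch', form='formatted', action='readwrite')
   write(lun, '(i0)') (i, i=1, 3)   ! three records: 1, 2, 3

   endfile(lun)      ! file is now positioned after the endfile record
   backspace(lun)    ! step back over the endfile record
   write(lun, '(i0)') 4             ! appended after the third record

   ! Count the records that can be read back.
   rewind(lun)
   n = 0
   do
      read(lun, '(a)', iostat=ios) line
      if (ios /= 0) exit
      n = n + 1
   end do
   print '(a,i0)', 'records in file: ', n
   close(lun)
end program endfile_then_append
```

Without the BACKSPACE (or a REWIND), the WRITE after ENDFILE is not permitted by the standard, regardless of whether the endfile record has any physical form.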

If I am not mistaken, neither Ifort nor Gfortran on Windows or Linux provides any “physical embodiment” to an end-of-file record, i.e., it is not possible to detect end of file by reading the file and looking for an EOF record. The underlying OS can be asked for the file size, and the RTL has to simulate the expected behavior associated with an end-of-file mark that actually had a physical embodiment.

I suspect a bug in the Gfortran implementation.

P.S.: This behavior has been reported on the GCC Bugzilla, and the responses there indicate differences of opinion on whether this is a bug or a feature.


Curious if NVfortran/Cray/IBM/PGI/Lfortran get five or six lines in the file. If anyone can run the example with something other than gfortran/ifx it would be appreciated.

I ran @urbanjost’s test program with AMD/AOCC flang 4.2.0 on an Ubuntu 22.04 x86_64 system and got this output:

 rewind and read            5 lines
            1          101
            2          102
            3          103
            4          104
            5          105
 number of lines in file was            10 , is now             5

I got the same five-line result when compiling with g95, after setting lun = 66 and changing newunit=lun to unit=lun, because g95 does not have that f2008 feature.

I see the Bugzilla report for gfortran describes the same issue, and it looks like all the other processors tested here take a different interpretation. Perhaps a clarification from the standards committee and/or a rephrasing in the standard is required. Thanks to everyone for testing processors. Since the issue has already been reported to gfortran (which I had missed), I added a test similar to the example program: it uses a scratch file, runs once per program execution, and saves the result. If it gets six lines instead of five, the code does a backspace as a work-around until the behavior is consistent.
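A run-time probe along those lines might look roughly like the sketch below. This is not the poster's actual code: the module name `endfile_probe` and the function `records_after_endfile` are hypothetical, and only the detection step is shown. It repeats the experiment from the example program on a scratch file and returns how many records survive the ENDFILE, so the caller can decide whether a work-around is needed.

```fortran
module endfile_probe
   implicit none
   private
   public :: records_after_endfile
contains
   ! Write nrec records to a scratch file, read back nkeep of them
   ! (nkeep >= 1), execute ENDFILE, and return how many records can
   ! then be read. The standard-conforming answer is nkeep.
   function records_after_endfile(nrec, nkeep) result(nleft)
      integer, intent(in) :: nrec, nkeep
      integer :: nleft
      integer :: lun, i, j, ios

      open(newunit=lun, status='scratch', form='formatted', action='readwrite')
      write(lun, '(i0)') (100+i, i=1, nrec)

      rewind(lun)
      read(lun, *) (j, i=1, nkeep)   ! same read pattern as the example program
      endfile(lun)

      rewind(lun)
      nleft = 0
      do
         read(lun, *, iostat=ios) j
         if (ios /= 0) exit
         nleft = nleft + 1
      end do
      close(lun)
   end function records_after_endfile
end module endfile_probe

program probe_demo
   use endfile_probe, only: records_after_endfile
   implicit none
   integer :: nleft

   nleft = records_after_endfile(10, 5)
   if (nleft == 5) then
      print '(a)', 'ENDFILE truncated right after the last record read'
   else
      print '(a,i0,a)', 'ENDFILE left ', nleft, ' records; a work-around is needed'
   end if
end program probe_demo
```

In real code the result would be computed once and saved (for example in a module variable), so the scratch file is only created on the first call.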

For the young people out there mystified by what this is about, know that a 9-track tape drive can write a special record marking the end-of-file, and has commands to skip (forwards and backwards) past a specified number of EOF records. This allows multiple files to be stored on a single tape, with rapid access to individual file starts. At least rapid by 1960s standards. Files can be labeled, in which case the label is a small file itself. End-of-tape is indicated by 2 EOF records in a row.

I am a little surprised there is a use for the fortran endfile statement on a disk file - I don’t think the authors of the standard anticipated that use.

How else would one truncate an existing file? Fortran disk files (formatted, unformatted, sequential, direct access) were in common use in the late 1970s, when f77 was standardized with the endfile statement. Tapes were also still used, although mostly for backup and archival storage and transfer (e.g. by mail) by that time. Reading/writing tape files directly from a fortran program, e.g. to do merge sorts, was becoming less common by then. Tape capacities continued to increase and tapes remained the most effective archival medium all through the 1980s and 1990s. I think the last tape backup I made was about 2003 (it was a 40GB digital linear tape (DLT)) – after that I switched to using portable external drives or CD/DVD media.

Five records is the correct result, according to the standard; I don’t see that any clarification is needed. It is true that in most cases there is no physical “endfile record”, though historically DEC Fortran (and probably Intel today) treats a one-byte CTRL-Z as an endfile record (at least in formatted input), but it doesn’t write that.

I touched on a bit of additional history of ENDFILE in the Doctor Fortran post “Military Strength” (stevelionel.com), as “read past an endfile record” was part of MIL-STD-1753 but was never adopted in the Fortran standard.

There are two ASCII characters that might have been used to signify an end of file, z’03’ and z’04’, but mostly weren’t. The former is denoted ETX for end of text, and the latter is denoted EOT for end of transmission. ETX is generated with ^C (control and C pressed simultaneously on a keyboard) and EOT is generated with ^D. Most Unix machines recognize ^C as an interrupt character (to stop a running process) and ^D as an end-of-file character for terminal input. That is, typing ^D in a terminal to a Fortran program will trigger the end= branch and set iostat to a negative value. The ^Z character, which is the ASCII SUB or substitute character, was adopted by DEC to denote end of text in several of its operating systems (TOPS-10, TOPS-20, RSX-11, VMS, and probably many others) and also by CP/M and its various copies. I think CP/M, etc. also used that character within floppy-disk files to denote end of file.
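The terminal behavior described above can be tried with a small sketch like this one (the program name `read_until_eof` is made up for illustration). It reads lines from standard input until the end-of-file condition is raised, using the `is_iostat_end` intrinsic from Fortran 2003 rather than testing for a specific negative value, since the actual value of iostat at end of file is processor dependent.

```fortran
program read_until_eof
   use, intrinsic :: iso_fortran_env, only: input_unit
   implicit none
   character(len=256) :: line
   integer :: ios, n

   n = 0
   do
      read(input_unit, '(a)', iostat=ios) line
      if (is_iostat_end(ios)) exit          ! end of file, e.g. ^D on a Unix terminal
      if (ios /= 0) error stop 'read error' ! any other nonzero status
      n = n + 1
   end do
   print '(a,i0,a)', 'read ', n, ' lines before end of file'
end program read_until_eof
```

Run interactively on a Unix-like system, typing a few lines followed by ^D at the start of a line should end the loop; piping a file into the program ends it when the file is exhausted.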

A fix for ENDFILE has just been pushed to the gfortran mainline (version 14).
