Use of Inquire to find Next record

urbanjost · November 18, 2021, 4:46pm

Partly because Fortran has been around on so many kinds of architectures, figuring out the number of records in a direct access file can be platform-specific. It can depend on which compiler and compiler options you choose and what hardware you are running on. But nowadays you can usually query the size of the file, divide by the record length in bytes, and you probably have the number of records.

Just be aware that not all OPEN statements use bytes as the unit for the RECL option, and that some rare environments that store files in interesting ways, usually for optimization purposes, can cause other problems.

So if you are using one compiler with compiler options you specify yourself, this is easy; otherwise it gets into a much longer discussion. I put a few hints in the program on what else to look up if you do not have the simple case:

program show
use ISO_FORTRAN_ENV, only : FILE_STORAGE_SIZE
implicit none
integer :: nexrec, ios, lun, i, sz, likely
character(len=256) :: message
integer,parameter  :: ll=160
character(len=ll) :: line
   ! first program makes a file
   open(newunit=lun, file='DBPP', access='direct',  &
   & status= 'replace', recl=ll,form='unformatted', &
   & iostat=ios, iomsg=message)
   do i=26,1,-1
      line=repeat(char(ichar('a')+i-1),ll)
      write(lun,rec=i)line
   enddo
   close(lun)
   write(*,*)'made and closed sample file'
   INQUIRE(IOLENGTH=likely) line
   ! second program reads a file
   open(newunit=lun, file='DBPP', access='direct',  &
   & status= 'old', recl=ll,form='unformatted', &
   & iostat=ios, iomsg=message)
   if(ios.eq.0)then
      inquire(unit=lun, nextrec=nexrec,size=sz)
      print *,'the next record is ', nexrec
      print *,'the size is ', sz
      print *,'the file storage unit size is ', file_storage_size
      print *,'the iolength of a line is', likely
      if(likely.ne.ll)then ! not in bytes
         print *,'and your file is probably bigger than you need'
         print *,'because your RECL units are not bytes'
         print *,'the number of lines is probably ',sz/ll/(ll/likely)
      else
         print *,'your file probably has', sz/ll, 'records'
      endif
   else
      print *, '<ERROR>'//trim(message)
      stop 1
   endif
   print *, 'ready'
end program show

One common case is ifort(1), where the RECL units depend on whether you compile with -assume byterecl or not, and so on, but probably all you need is:

program show
implicit none
integer ::  ios, lun, sz
character(len=256) :: message
integer,parameter  :: ll=160
   open(newunit=lun, file='DBPP', access='direct',  &
   & status= 'old', recl=ll,form='unformatted', &
   & iostat=ios, iomsg=message)
   if(ios.eq.0)then
      inquire(unit=lun, size=sz)
      print *,'the size is ', sz
      print *,'your file probably has', sz/ll, 'records'
   else
      print *, '<ERROR>'//trim(message)
      stop 1
   endif
   print *, 'ready'
end program show

urbanjost · November 18, 2021, 5:43pm

PS: works with three compilers with switches as shown

 ifort exa2.f90 -assume byterecl
urbanjs@venus:~$ ./a.out
 the size is         4160
 your file probably has          26 records
 ready
urbanjs@venus:~$ gfortran exa2.f90
urbanjs@venus:~$ ./a.out
 the size is         4160
 your file probably has          26 records
 ready
urbanjs@venus:~$ nvfortran exa2.f90
urbanjs@venus:~$ ./a.out
 the size is          4160
 your file probably has           26 records
 ready

How many bytes does your file have, and what compiler command(s) are you using?

Just to show a case where it is not that simple, here is a compile where the RECL unit is not bytes. It shows an error I find a lot: the code runs just fine, but often 3/4 of the file is unused data:

ifort exa.f90 -assume nobyterecl
urbanjs@venus:~$ ./a.out
 made and closed sample file
 the next record is            1
 the size is        16640
 the file storage unit size is            8
 the iolength of a line is          40
 and your file is probably bigger than you need
 because your RECL units are not bytes
 the number of lines is probably           26
 ready

That is a simple mistake and a bigger deal when the files are a Terabyte in size! The file only needs to be 4160 bytes (26 records of 160 bytes) but in this case is four times larger; the code will still run just fine.
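One way to sidestep the RECL-units question when creating a file is to ask the compiler for the record length in its own units with INQUIRE(IOLENGTH=) and pass that value to OPEN. This is only a minimal sketch, assuming each record holds a single character variable; the names recl_sketch and open_direct are made up for illustration:

```fortran
module recl_sketch
implicit none
private
public :: open_direct
contains
   ! Create a direct access file whose records each hold one character
   ! variable the same length as TEMPLATE. The record length is requested
   ! in whatever units this compiler uses for RECL= (bytes, words, ...).
   subroutine open_direct(filename, template, lun, ios)
      character(len=*), intent(in) :: filename, template
      integer, intent(out)         :: lun, ios
      integer :: rl
      inquire(iolength=rl) template   ! length of one record in RECL units
      open(newunit=lun, file=filename, access='direct', form='unformatted', &
         & status='replace', recl=rl, iostat=ios)
   end subroutine open_direct
end module recl_sketch
```

Called with a 160-character template, this should give 160-byte records whether or not RECL is counted in bytes, so 26 records would occupy 26*160 bytes.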

One of my favorites with direct access files is where people do something like ENTRY=ENTRY+1 and write to REC=ENTRY but ENTRY was never initialized, often because someone assumed it would always be initialized to zero. Programs crash, get wrong answers, create bad output files, fill file systems, exceed file size limits or even crash machines when ENTRY is not initially zero; from just one uninitialized value.
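A small sketch of one defensive pattern for that counter: keep it as an INTENT(INOUT) argument that the caller must set before the first call, and refuse obviously bad values. The names entry_counter_sketch, append_record and last_rec are hypothetical, not from any program in this thread:

```fortran
module entry_counter_sketch
implicit none
private
public :: append_record
contains
   ! Write LINE as record LAST_REC+1 and advance the counter on success.
   ! LAST_REC must be given a value by the caller before the first call
   ! (0 for a new file); a negative value is rejected instead of used.
   subroutine append_record(lun, last_rec, line, ok)
      integer, intent(in)          :: lun
      integer, intent(inout)       :: last_rec
      character(len=*), intent(in) :: line
      logical, intent(out)         :: ok
      integer :: ios
      ok = .false.
      if (last_rec < 0) return      ! counter was never set up sensibly
      write(lun, rec=last_rec+1, iostat=ios) line
      if (ios /= 0) return
      last_rec = last_rec + 1
      ok = .true.
   end subroutine append_record
end module entry_counter_sketch
```

The caller sets last_rec = 0 for a new file, or to the known record count for an existing one, before the first call. A guard like this cannot catch every garbage value, so the explicit initialization is still the important part.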

miramar46 · November 18, 2021, 8:22pm

I am using Lahey-Fujitsu v7.3 for 32 bit windows. I don’t have any special compiler options but I did find in the Users Guide the following that I think are applicable to my situation:
d[num] 1
The d option determines the size of the input/output work area used by a direct access input/output statement. The d option improves input/output performance when data is read from or written to files a record at a time in sequential record-number order. If the d option is specified, the input/output work area size is used for all units used during execution.
To specify the size of the input/output work area for individual units, specify the number of Fortran records in the environment variable FUnnBF where nn is the unit number (see “Environment Variables for Input/Output” on page 136 for details). When the d option and environment variable are specified at the same time, the d option takes precedence. The optional argument num specifies the number of Fortran records, in fixed-block format, included in one block. The optional argument num must be an integer from 1 to 32767. To obtain the input/output work area size, multiply num by the value specified in the RECL= specifier of the OPEN statement. If the files are shared by several processes, the number of Fortran records per block must be 1. If the d option is omitted, the size of the input/output work area is
FUnnBF = size
The FUnnBF environment variable specifies the size of the input/output work area used by a sequential or direct access input/output statement. The value nn in the FUnnBF environment variable specifies the unit number. The size argument used for sequential access input/output statements is in kilobytes; the size argument used for direct access input/output statements is in records. The size argument must be an integer with a value of 1 or more. A size argument must be specified for every unit number.
If this environment variable and the g option are omitted, the input/output work area size used by sequential access input/output statements defaults to 1 kilobytes. The size argument for direct access input/output statements is the number of Fortran records per block in fixed-block format. The size argument must be an integer from 1 to 32767 that indicates the number of Fortran records per block. If this environment variable and the d option are omitted, the area size is 1K bytes.
According to the operating system this file DBPP is 3840 bytes with only the 25 records it has now.
When I first created the file DBPP, Inquire said RECL=160, so for 25 records that would be 4000 bytes, which agrees with the 4160 you got for 26 records. Do you think I should try these compiler switches d[num]1 and FUnnBF? Thanks for your help. Floyd

urbanjost · November 19, 2021, 12:51am

Those parameters are useful for optimizing I/O by changing the size of the cache area used for I/O buffering. The online manual specifically indicates that RECL= is in bytes for both OPEN and INQUIRE, so the last example should work as-is for your file. It is not guaranteed by the Fortran standard, but the file is very likely a stream of bytes on your local file system with a size of 160 times the number of lines, since all direct access records are by definition the same size. Particularly because your record length is a multiple of four and you are on a standard platform type, it is extremely likely the last sample program will work. It is highly likely that the file only contains 24 lines (3840/160 = 24), and that if you expect 25, that is probably an issue with the program that wrote the file, not with the above technique. So, with the caveat that it is not portable to all platforms, just query the size with INQUIRE() and divide by the length of your records (160) and you should get the number of records. Does the last sample program work and show the size as 3840 and the number of lines as 24?
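The divide-by-record-length step can be made a little more defensive by also looking at the remainder. A rough sketch, assuming both the file size and the record length are already known in bytes; the names record_count_sketch and count_records are made up for this post:

```fortran
module record_count_sketch
use iso_fortran_env, only : int64
implicit none
private
public :: count_records
contains
   ! Split a file size into whole records plus leftover bytes.
   ! A nonzero LEFTOVER suggests the record length (or its units) is wrong.
   pure subroutine count_records(size_bytes, reclen_bytes, nrec, leftover)
      integer(int64), intent(in)  :: size_bytes, reclen_bytes
      integer(int64), intent(out) :: nrec, leftover
      nrec     = size_bytes / reclen_bytes
      leftover = mod(size_bytes, reclen_bytes)
   end subroutine count_records
end module record_count_sketch
```

For the 3840-byte file in this thread and 160-byte records, this gives 24 records with nothing left over. The size would normally come from INQUIRE(SIZE=) where the compiler supports it, or from the operating system otherwise.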

miramar46 · November 19, 2021, 11:37am

Some of the specifiers in your code don’t work on my compiler. I looked in the Language Reference and both iomsg and size are not allowed in the open statement. I did add the INQUIRE for iolength but I still get nexrec=1 and iolength=0.
I did not add the code where you make the sample file, writing from 26 to 1 in steps of minus one.
What is the purpose of that? I already have the file written and I know it contains 25 records that are good. I have another program that gets the data from ‘DBPP’ and writes it to an output file. The results are all correct. I don’t understand why your program works and mine doesn’t.
Floyd

mecej4 · November 19, 2021, 1:08pm

Miramar46,

A nice discussion of the properties of direct access files, dating back 17 years, may be useful to you.

My view is that the entire task is easily accomplished with a text editor or a spreadsheet program, given that the whole “database” is less than 100 records of 160 characters each. If you insist on writing a Fortran program for maintaining such short files, note that direct access gives you insignificant benefits over sequential access, and the limitations of direct access files make them unsuited for the intended application.

Consider, as well, that a commodity computer today has about 8 GB of RAM, of which at least 1 or 2 GB can be used to hold millions of 160-byte records. It is, therefore, convenient and less error-prone to read the entire data into memory, do all the addition-deletion-modification operations in memory, and write the revised data back to a new version of the data file. We do not have to perform record-oriented I/O as if we were using a 1960s business-oriented computer.
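The read-everything, edit in memory, write-it-back approach might look roughly like this for a formatted sequential file with one record per line. It is a sketch only: the 160-character limit is taken from this thread, and the names in_memory_sketch, load_records and save_records are hypothetical:

```fortran
module in_memory_sketch
implicit none
private
public :: reclen, load_records, save_records
integer, parameter :: reclen = 160   ! record length used in this thread
contains
   ! Read every line of a text file into RECS. On return IOS is 0 on
   ! success; if it is nonzero RECS may be unallocated or incomplete.
   subroutine load_records(filename, recs, ios)
      character(len=*), intent(in)                    :: filename
      character(len=reclen), allocatable, intent(out) :: recs(:)
      integer, intent(out)                            :: ios
      integer :: lun, n, i
      character(len=reclen) :: line
      open(newunit=lun, file=filename, status='old', action='read', iostat=ios)
      if (ios /= 0) return
      n = 0
      do                                   ! first pass: count the lines
         read(lun, '(a)', iostat=ios) line
         if (ios /= 0) exit
         n = n + 1
      enddo
      if (.not. is_iostat_end(ios)) then   ! a real error, not end of file
         close(lun)
         return
      endif
      ios = 0
      allocate(recs(n))
      rewind(lun)
      do i = 1, n                          ! second pass: read the lines
         read(lun, '(a)', iostat=ios) recs(i)
         if (ios /= 0) exit
      enddo
      close(lun)
   end subroutine load_records

   ! Write RECS back out, one record per line, replacing any old file.
   subroutine save_records(filename, recs, ios)
      character(len=*), intent(in) :: filename
      character(len=*), intent(in) :: recs(:)
      integer, intent(out)         :: ios
      integer :: lun, i
      open(newunit=lun, file=filename, status='replace', action='write', iostat=ios)
      if (ios /= 0) return
      do i = 1, size(recs)
         write(lun, '(a)', iostat=ios) trim(recs(i))
         if (ios /= 0) exit
      enddo
      close(lun)
   end subroutine save_records
end module in_memory_sketch
```

Between the two calls the records are an ordinary array, so adding, deleting, or changing one is plain array manipulation. Writing to a new file name rather than overwriting the original keeps the old version around if something goes wrong.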

miramar46 · November 19, 2021, 1:19pm

I am beginning to think of switching to sequential since direct access has been so difficult.
However, I don’t think I can read data from a spreadsheet into a Fortran program. Am I wrong?
I want to write some programs which need some, but not all, of these physical properties in the file ‘DBPP’. The choice of which compounds are read into the program depends on the user.

Thank you for your help,
Floyd Pfeffer

Beliavsky · November 19, 2021, 1:26pm

I save Excel files as csv files and read those from Fortran. Discussion of reading .xls files directly using Intel Fortran is here. Excel can also export files to XML, for which there are Fortran libraries.

mecej4 · November 19, 2021, 1:28pm

I don’t use spreadsheets, but I believe that any spreadsheet program in use today will have the capability of writing (“exporting”) data of the kind that you have into a CSV (comma-separated-values) file, which can be read as a formatted sequential file by a Fortran program.
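As a rough illustration of reading such a file, each line can be read with an '(a)' format and then split at the commas. The sketch below assumes simple CSV with no quoted fields containing commas; the names csv_sketch and split_csv are invented for this example:

```fortran
module csv_sketch
implicit none
private
public :: split_csv
contains
   ! Split LINE at commas into FIELDS, left-justifying each field.
   ! Quoted fields with embedded commas are NOT handled, and fields
   ! beyond size(FIELDS) are ignored.
   subroutine split_csv(line, fields, nfields)
      character(len=*), intent(in)  :: line
      character(len=*), intent(out) :: fields(:)
      integer, intent(out)          :: nfields
      integer :: start, comma
      nfields = 0
      fields  = ''
      start   = 1
      do while (nfields < size(fields))
         comma   = index(line(start:), ',')   ! offset of next comma, 0 if none
         nfields = nfields + 1
         if (comma == 0) then                 ! last field on the line
            fields(nfields) = adjustl(line(start:))
            exit
         endif
         fields(nfields) = adjustl(line(start:start+comma-2))
         start = start + comma                ! first character after the comma
      enddo
   end subroutine split_csv
end module csv_sketch
```

A numeric field can then be converted with an internal read such as read(fields(2),*) x. Splitting by hand rather than using a list-directed read on the whole line avoids surprises when a text field contains blanks or a slash.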

JohnCampbell · November 26, 2021, 2:19am

Another complexity: it is my understanding that direct access files do not have to be written sequentially. (Happy to be corrected if I am wrong about this.)
This means that Nextrec is not essential for extending a direct access file.
What is important for you to know is which records have been written and are still valid.
Records can be written, read or updated in any order.

One of the key limitations of direct access files is that record lengths are fixed as all being the same.

I am at present experimenting with stream I/O, which also has a direct addressing capability for random I/O but also supports variable length records. You must manage the individual record addresses and properties. So, providing you keep a table of records, their size and location, this can be a much more flexible record structure format. I expect there is a limitation that stream I/O files must be written sequentially, so Last_Rec (or Last_Pos) is an important property.
You can also construct records with a header:data:trailer structure to enable sequential access or reconstructing the record table. The header and trailer can be an 8-byte integer to remove any 2 GB size limits, as can occur with some 32-bit (derived) Fortran compilers.
This record structure can be managed via library routines, while retaining Fortran read/write I/O list flexibility for the data component.
Lots of potential for full 64-bit support.
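A minimal sketch of the header:data:trailer idea with stream I/O, under some assumptions: the data is a single character string, the header and trailer are both the data length as an 8-byte integer, the file storage unit is one byte, and the caller tracks the last used position. The names stream_records_sketch, append_record, read_record and last_pos are illustrative only:

```fortran
module stream_records_sketch
use iso_fortran_env, only : int64
implicit none
private
public :: append_record, read_record
integer(int64), parameter :: hdr = 8   ! bytes in each length field
contains
   ! Append TEXT as length:data:length after byte LAST_POS (0 for an
   ! empty file). START is where the record begins, for the record table.
   subroutine append_record(lun, last_pos, text, start)
      integer, intent(in)           :: lun
      integer(int64), intent(inout) :: last_pos
      character(len=*), intent(in)  :: text
      integer(int64), intent(out)   :: start
      integer(int64) :: n
      n     = len(text, kind=int64)
      start = last_pos + 1
      write(lun, pos=start) n, text, n
      last_pos = start + 2*hdr + n - 1
   end subroutine append_record

   ! Read the record beginning at START. OK is true only if the read
   ! succeeded and the trailer matches the header.
   subroutine read_record(lun, start, text, ok)
      integer, intent(in)                        :: lun
      integer(int64), intent(in)                 :: start
      character(len=:), allocatable, intent(out) :: text
      logical, intent(out)                       :: ok
      integer(int64) :: n, trailer
      integer :: ios
      ok = .false.
      read(lun, pos=start, iostat=ios) n        ! header: data length
      if (ios /= 0 .or. n < 0) return
      allocate(character(len=n) :: text)
      read(lun, iostat=ios) text, trailer       ! data, then trailer
      if (ios /= 0) return
      ok = (trailer == n)
   end subroutine read_record
end module stream_records_sketch
```

The unit is expected to be opened with access='stream' and form='unformatted'. Saving each returned start value in a table allows records of different lengths to be read back in any order, and the trailer gives a cheap consistency check.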
