Is there any reliable way of printing a problematic line in a `namelist` file?

Hello,

I would like to detect and print a line that is causing a reading error in a namelist, for instance:

&config
  param1 = 10
  param2 === 0.5 ! problematic line 
  param3 = "Hello"
/

I’d like my input parsing code to print the problematic line. I understand that this is not possible to do reliably with backspace (Solved: Re: backspace with namelist read - Intel Community). Do you know any neat way of doing this?

Thanks!

Hopefully, if you use the IOMSG option on your READ you will get a useful message including the location. If not, it depends on how hard it is to re-read just
the NAMELIST group you want (note that a BACKSPACE of a NAMELIST file is
NOT supported by the standard – if it works it is an extension). Assuming you have just one in a file it is easy to rewind the file and start by reading one line into
an internal file and then adding a last line of “/” and reading from that, increasing
one line at a time till an error occurs.

You would probably read it normally and then only try to slurp the file into an internal file if an error occurs, but for a
simple example (ignoring the procedure that reads the file into memory for simplicity) it might go something like this:

ASCII file slurper
module m_io
implicit none
character(len=*),parameter   :: gen=   '(*(g0))'
private
public :: fileread
contains
subroutine fileread(filename,pageout)
character(len=*),intent(in)              :: filename   
character(len=:),allocatable,intent(out) :: pageout(:) 
character(len=1),allocatable             :: text(:)    

   call filebyte(filename,text) 
   if(.not.allocated(text))then
      write(*,*)'*fileread* failed to load file '//filename
   else  
      pageout=topage(text)
      deallocate(text) 
   endif

contains
function topage(array)  result (table)
character(len=1),intent(in)  :: array(:)
character(len=:),allocatable :: table(:)
integer                      :: i
integer                      :: linelength
integer                      :: length
integer                      :: lines
integer                      :: linecount
integer                      :: position
integer                      :: sz
character(len=1),parameter   :: nl = char(10)
character(len=1),parameter   :: cr = char(13)
   lines = 0
   linelength = 0
   length = 0
   sz=size(array)
   do i = 1,sz
      if( array(i) == nl )then
         linelength = max(linelength,length)
         lines = lines + 1
         length = 0
      else
         length = length + 1
      endif
   enddo
   if( sz > 0 )then
      if( array(sz) /= nl )then
         lines = lines+1
      endif
   endif

   if(allocated(table))deallocate(table)
   allocate(character(len=linelength) :: table(lines))
   table(:) = ' '

   linecount = 1
   position = 1
   do i = 1,sz
      if( array(i) == nl )then
         linecount=linecount+1
         position=1
      elseif( array(i) == cr )then
      elseif( linelength /= 0 )then
         table(linecount)(position:position) = array(i)
         position = position+1
      endif
   enddo
end function topage

end subroutine fileread

subroutine filebyte(filename,text)
character(len=*),intent(in)              :: filename 
character(len=1),allocatable,intent(out) :: text(:) 
integer                                  :: nchars=0 
integer                                  :: igetunit 
integer                                  :: ios=0 
integer                                  :: i 
character(len=256)                       :: message 
character(len=4096)                      :: label 
character(len=:),allocatable             :: line 
   message=''
   open(newunit=igetunit, file=trim(filename), action="read", iomsg=message,&
   &form="unformatted", access="stream",status='old',iostat=ios)
   label=filename
   if(ios == 0)then  
      inquire(unit=igetunit, size=nchars)
      if(nchars <= 0)then
         write(*,gen) '*filebyte* empty file '//trim(label) 
         return
      endif

      if(allocated(text))deallocate(text) 
      allocate ( text(nchars) )           
      read(igetunit,iostat=ios,iomsg=message) text      
      if(ios /= 0)then
         write(*,gen)'*filebyte* bad read of '//trim(label)//':'//trim(message)
      endif
   else
      write(*,gen)'*filebyte* '//message
      allocate ( text(0) )           
   endif

   close(iostat=ios,unit=igetunit)            

end subroutine filebyte
end module m_io
program main
use M_io,      only : fileread
implicit none
character(len=*),parameter   :: gen=   '(*(g0))'
character(len=:),allocatable :: page(:) , test(:)
integer                      :: param1 ; namelist /config/ param1
real                         :: param2 ; namelist /config/ param2
character(len=256)           :: param3 ; namelist /config/ param3
character(len=256)           :: iomsg
integer                      :: iostat
integer                      :: i
   ! allocate character array and copy file into it
   call fileread('namelist.nml',page)
   if(.not.allocated(page))then
      write(*,gen)'<ERROR>*whichline* failed to load file'
      stop 1
   endif

   read(page,nml=config,iostat=iostat,iomsg=iomsg)
   if(iostat.ne.0)then
      write(*,gen)'<ERROR>*whichline* '//trim(iomsg)
      write(*,gen)'<ERROR>*whichline* locating error:'
      do i=1,size(page)
         test=[character(len=len(page)) :: page(:i),'/']
         read(test,nml=config,iostat=iostat,iomsg=iomsg)
         if(iostat.ne.0)then
            write(*,gen)'<ERROR>*whichline* on line ',i,' : ',trim(page(i))
            !write(*,gen)'<ERROR>*whichline* '//trim(iomsg)
            stop 2
         endif
      enddo
   endif
   deallocate(page)  ! release memory
end program main

So given the assumption you can isolate just the namelist with the error in it,
and that the input file is really large and often hand-edited so errors like this
occur in the first place, that might work. We have a program where the code itself does not do that check, but if a user composes or changes an input file they have
to validate it using a program that has the same namelist defined in it similar to the above program before they insert it into a bigger multi-case input file.

How well the compiler IOMSG works is very compiler-specific but that is simple when it works well. If you have that (rare(?)) situation where you are making really big NAMELIST inputs at least partially manually maybe an approach like the pre-tester described above is worth it. The syntax of a NAMELIST input is structured well enough it can also probably be syntax-checked relatively easily reading it in and checking if it following rules byte-by-byte. That will not find type mismatches and unknown or misspelled variable names and so on but would find something like the ===.

Would need more info on exactly why spotting errors by eye along with the IOMSG is not sufficient, and whether you only have one set per file and how much time or impact these kinds of errors are having to say if any of that is worth it or not

Sample output

================================================================================
&config
  param1 = 10
  param2 === 0.5 ! problematic line
  param3 = "Hello"
/

================================================================================
<ERROR>*whichline* namelist read: misplaced = sign
<ERROR>*whichline* locating error:
<ERROR>*whichline* on line 3 :   param2 === 0.5 ! problematic line
1 Like

I am curious how well various compilers do with a minimal reproducer:

program main
implicit none
character(len=*),parameter   :: gen=   '(*(g0))'
character(len=:),allocatable :: page(:)
integer                      :: param1 ; namelist /config/ param1
real                         :: param2 ; namelist /config/ param2
character(len=256)           :: param3 ; namelist /config/ param3
character(len=256)           :: iomsg
integer                      :: iostat
page=[character(len=80) :: &
'&config                              ', &
'  param1 = 10                        ', &
'  param2 === 0.5 ! problematic line  ', &
'  param3 = "Hello"                   ', &
'/' ]
   read(page,nml=config,iostat=iostat,iomsg=iomsg)
   if(iostat.ne.0)then
      write(*,gen)'<ERROR> '//trim(iomsg)
   endif
end program main

gfortran

<ERROR> namelist read: misplaced = sign

flang

<ERROR>   Bad real input data at column 11 of record 3

Any others?

1 Like

https://degenerateconic.com/namelist-error-checking.html

1 Like

Hmmm. Reading the standard I think depending on a backspace on list-directed and NAMELIST input is non-standard; on the other hand it seems to work with several compilers when a file (backspace does not apply to reading from a file like stdin or an internal file).so that is easier. Another thing I think is non-standard is if you print the namelist the first value not changed is the one the error occurred on as well.

1 Like

I also wrote a small reproducer:

implicit none

integer :: unit, ios, param1
real :: param2
character(10) :: param3
character(255) :: errmsg

namelist /config/param1, param2, param3

open (NEWUNIT = unit, FILE = 'config.nml', STATUS = 'OLD', ACTION = 'READ', IOSTAT = ios, IOMSG = errmsg)
if (ios /= 0) error stop 'OPEN: '//trim(errmsg)

read (unit, config, IOSTAT = ios, IOMSG = errmsg)
if (ios /= 0) error stop 'READ: '//trim(errmsg)

close (unit)

end

And it seems the error is always reported without issue (at least for the compilers installed on my machine), but only ifort/ifx/flang-new messages are actually useful.

$ gfortran config_namelist.f90 && ./a.out 
ERROR STOP READ: namelist read: misplaced = sign

Error termination. Backtrace:
#0  0x7f9397221b9a in ???
#1  0x7f9397222699 in ???
#2  0x7f9397223a17 in ???
#3  0x55b4c3c9e5f8 in ???
#4  0x55b4c3c9e675 in ???
#5  0x7f9396f4dd67 in __libc_start_call_main
	at ../sysdeps/nptl/libc_start_call_main.h:58
#6  0x7f9396f4de24 in __libc_start_main_impl
	at ../csu/libc-start.c:360
#7  0x55b4c3c9e120 in ???
#8  0xffffffffffffffff in ???

$ ifort -diag-disable=10448 config_namelist.f90 && ./a.out 
READ: syntax error in NAMELIST input, unit -129, file /home/jwm/tests/config.nml, line 3, position 12

$ ifx config_namelist.f90 && ./a.out 
READ: syntax error in NAMELIST input, unit -129, file /home/jwm/tests/config.nml, line 3, position 12

$ flang-new config_namelist.f90 && ./a.out 
Fortran ERROR STOP: READ: Bad real input data at column 12 of record 3

$ nvfortran config_namelist.f90 && ./a.out 
ERROR STOP READ: syntax error - entity name expected
1 Like

Interesting. Using an internal file ifx(1) always seems to say line -1, while flang shows the right position with regular files as well as internal; but interesting survey
particularly because everyone gives a different result. Perhaps an enhancement request for gfortran to show the position is justified, as it actually does indicate the problem is an = character which is perhaps the best answer if it only said where it found it at. Weird how it seems to be the rule that the line the error occurred at is not just echoed back, although that info might be lost internally, as processing might have lost that information depending on how the NAMELIST group is actually being read.

1 Like

Thank you @urbanjost and @jwmwalrus for the answers and nice tests! My conclusions are:

Pity that such an important feature is not yet ubiquitous across compilers.

FYI: Added a request for enhancement to gfortran

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118793

LFortran does not do NAMELIST yet but maybe there is someplace the example program could go as a confidence test for later use; cannot seem to get my nvidia ID to work at the moment so if anyone wants to add it feel free. Curious about NAG, Cray, IBM, … behavior.

About the backspace – seems to be a usable behavior on at least several compilers even though not standard. It is not forbidden by the standard, just not required so it could be treated as a common extension. It seems like it could be tested for by reading an intentional error in a scratch file at run time and then doing the same on the actual file if it works if important enough to someone, as NAMELIST scans a file for the beginning of a NAMELIST list so a backspace would only get complicated if the error occurred on the first line, assuming the program did not stop on the error in the first place.

I think that part of the i/o specification in fortran is to allow the maximum flexibility to the library in processing records (not just namelist, but regular read statements too). For example, the whole namelist block could be read into an internal character buffer, discarding the <lf> characters along the way, and then the parsing could be done from that internal buffer. The actual file pointer would be after the end / in that case. Or it could be done line by line, in which case the actual file pointer would be on the line with the error. Or it might be done in groups of, say 8 lines each, in which case the actual file pointer might be several lines beyond the error, but still within the namelist block. Another consideration is that the namelist block might be read from a nonseekable device, in which case backspace/rewind/etc. would not be supported at all. When you consider all possible device types, pipes, here documents, and other types of command line redirections, that is a lot of possibilities where backspace might or might not be allowed, and the fortran program itself cannot know in advance what it will encounter at run time.

Of course, this could be imposed by the standard at some future time, but you can see why it might not have been done up to this point in time.

1 Like

Yes, that is perhaps why several compilers might not be reporting the line number and column when a syntax error is encountered. I have had to deal with something like that before where it ended up each character had to be a user-defined type where the original line number and column position were stored with the character
so I know it might not just be so simple to report where an error occurred, but that
it is possible even when everything is being jumbled about.

So the backspace cannot be literally applied even to stdin and internal files but NAMELIST input in a file is still predominantly the most common case, and it worked with several compilers. I had not tried that myself because I already had in mind that is was not defined by the standard, but seeing it work with several compilers and not having a great alternative if the compiler does not already show the position of the error it is worth considering in a specific case. If used, it definitely needs a few comments for the next person who looks at the code.