Reading a file with NUL values into an array

Hi,
I am dealing with a Fortran code that reads numeric data from a file that is generated from another software. The data looks like this:


The headings are stripped off and handled separately and the numeric data is read into an array using this routine:

SUBROUTINE ReadR4Ary ( UnIn, Fil, Ary, AryLen, AryName, AryDescr, ErrStat, ErrMsg, UnEc )


   ! This routine reads a AryLen values into a 4-byte real array separated by white space
   ! (possibly on the same line of the input file).


   ! Argument declarations:

INTEGER,      INTENT(IN)          :: AryLen                                     ! Length of the array.
INTEGER,      INTENT(IN)          :: UnIn                                       ! I/O unit for input file.
INTEGER,      INTENT(IN),OPTIONAL :: UnEc                                       ! I/O unit for echo file. If present and > 0, write to UnEc
INTEGER,      INTENT(OUT)         :: ErrStat                                    ! Error status
CHARACTER(*), INTENT(OUT)         :: ErrMsg                                     ! Error message


REAL(SiKi), INTENT(INOUT)         :: Ary(AryLen)                                ! Real array being read.

CHARACTER(*), INTENT(IN)          :: Fil                                        ! Name of the input file.
CHARACTER(*), INTENT(IN)          :: AryDescr                                   ! Text string describing the variable.
CHARACTER(*), INTENT(IN)          :: AryName                                    ! Text string containing the variable name.


   ! Local declarations:

INTEGER                      :: Ind                                             ! Index into the real array.  Assumed to be one digit.
INTEGER                      :: IOS                                             ! I/O status returned from the read statement.



READ (UnIn,*,IOSTAT=IOS)  ( Ary(Ind), Ind=1,AryLen )

CALL CheckIOS ( IOS, Fil, TRIM( AryName ), NumType, ErrStat, ErrMsg )
   IF (ErrStat >= AbortErrLev) RETURN

DO Ind=1,AryLen
   CALL CheckRealVar( Ary(Ind), AryName, ErrStat, ErrMsg)
      IF (ErrStat >= AbortErrLev) RETURN
END DO

IF ( PRESENT(UnEc) )  THEN
   IF ( UnEc > 0 ) THEN
      WRITE( UnEc, Ec_ReAryFrmt ) TRIM( AryName ), AryDescr, Ary(1:MIN(AryLen,NWTC_MaxAryLen))
   END IF
END IF


RETURN
END SUBROUTINE ReadR4Ary

The READ statement loads the data into the array “Ary”. If the READ is successful - IOS = 0 and if not successful the IOS would be some integer number.
The next line CALL CheckIOS checks about this and returns errror if IOS is not 0,

Now in the new version of the software, there are NUL characters introduced in the output file.

Now the line in the above code:

READ (UnIn,*,IOSTAT=IOS)  ( Ary(Ind), Ind=1,AryLen )

Sets IOS to some integer value (because the read failed) and thereby throws error.

My question is

  1. Is there a way to remove the NUL values in the file even before sending the file to this routine, so that I need not modify this routine ?
  2. If that’s not feasible how can I modify this code to be able to accommodate NUL values ?

Link to the new output file with NUL values

Thank you,
Ashok

My solution would be to read the lines into a character string, replace the NUL characters in that string and then read the data from that sanitised string. You can use list-directed input on a string, so the fix should be extremely limited :slight_smile:

Do you control the generation of the input file? It looks like an
uninitialized character variable is being printed by mistake and should
be corrected.

If you are in a ULS(Unix-Like System (including WSL)) there are many
commands to stream-edit files. Some common ones would be

# if the literal string NUL actually appears in the file replace it with a space
sed -i -e 's/NUL/ /g' FILE
# if you mean there are NULL characters in the file
tr '\000' ' ' < FILE > FILE.new

Or you do not have a ULS you can create a little filter program of your own.

A Fortran program to replace invisible or non-ASCII characters:

program scrub
use, intrinsic :: iso_fortran_env, only : stderr=>ERROR_UNIT
integer,parameter :: longest=255  ! set to longest
character(len=longest) :: line
character(len=255) :: iomsg
integer :: i, iostat, ich
   do
      READ (*,'(a)',IOSTAT=IOStat,iomsg=iomsg)  line
      if(iostat==0)then
         do i=1,len(line)
           ich=ichar(line(i:i))
           if(ich<32.or.ich>126) line(i:i)=' '
         enddo
         write(*,'(a)')trim(line)
      else
         write(stderr,'(a)')trim(iomsg)
         stop iostat
      endif
   enddo
end program scrub

Or one example of what to replace your READ with, assuming
there are non-printable characters in it.
```fortran
   SCRUBLINE: BLOCK
   character(len=255) :: line
   integer :: i, ios
   READ (UnIn,'(a)',IOSTAT=IOS)  line
   if(iostat==0)then
      do i=1,len(line)
        if(ichar(line(i:i))<32) line(i:i)=' '
      enddo
      READ (line,*,IOSTAT=IOS)  ( Ary(Ind), Ind=1,AryLen )
   endif
   endblock SCRUBLINE

If the NUL strings are always in the same columns you can just read those
columns you want in and build the line to read the numeric values from,
or read the whole line in and compose a new line not including the nulls,
but just generically replacing non-printable characters is just as easy and
more general.

If you are replacing the string “NUL” instead of the null character and are not on Linux
and want to make your own filter using Fortran the replace(3f) and substitute(3f) procedures in M_strings are trivial to use as a dependency from *fpm if you build your Fortran programs with fpm assuming simple unconditional string replacement suffices.

1 Like

Thanks a lot @urbanjost and @Arjen .
@urbanjost , I am not on Linux so I will try your second solution (Arjen is also pointing to the same).
I will take a look at M_strings. Will keep you posted !

Yeah partly. Great insight. Let me see that also

You want to switch to a formatted read insted of list directed. That way the NUL bytes can bbe skipped.

becomes [quote]

READ (UnIn,‘(7(g15.0),1x)’,IOSTAT=IOS) ( Ary(Ind), Ind=1,AryLen )
[/quote]

which reads ;7 fields, each consistig of a 15 digit number and one character to be ignored.
See this for a descrirption of the format specifiers. in case I haven’t counted the characters properly. Note that if Anylen can be greater than 7, you would need to change the 7 to a number at least a large as AnyLen.

1 Like

Thank you @urbanjost for your insight.

Yes the issue was caused by a uninitialized variable in the routine that generates that file. Now it’s resolved.

Thanks @feenberg , your solution also works. Just that I don’t want to change that routine.

The “7” repeat count could be changed to * to accommodate arbitrary array lengths. The problem would then be what happens when the last number is read that is not followed by a space – that would result in an EOR type error. That can be fixed with the format (UnIn,‘(*(g15.0),:1x)’,IOSTAT=IOS). That : character tells the processor to stop if the input list is exhausted. Among other things, this shows how powerful fortran format statements can be.

Another minor modification I might suggest is to use an array expression instead of an implied do loop in the list. Depending on how the array is declared, and personal style preferences, the final read statement might be any of the following:

READ (UnIn,‘(*(g15.0),:1x)’,IOSTAT=IOS) Ary
READ (UnIn,‘(*(g15.0),:1x)’,IOSTAT=IOS) Ary(:)
READ (UnIn,‘(*(g15.0),:1x)’,IOSTAT=IOS) Ary(1:AryLen)

Note that the format statements do not really replace exactly the original list-directed i/o conventions, so there might be good reasons to keep the original instead. For example, if the field widths are not always exactly 15 characters, or if the input is sometimes spread over several input records, then you need the flexibility afforded by list-directed input.

1 Like