Reading new line or tab characters from file

I’m reading a file into a character string like this:


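character(len=50), intent(in) :: flname
character(len=:), allocatable :: str
integer :: size

! Ask for the file size in file storage units (usually bytes)
inquire(file=flname, size=size)
allocate(character(len=size) :: str)

! Stream access reads the raw bytes, including new lines and tabs
open(1, file=flname, status='OLD', access='STREAM')
read(1) str
close(1)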

I’m trying to loop over the character string and detect certain characters, including ones like the new line (‘\n’) or tab (‘\t’) characters. But for some reason, I cannot detect those characters in a file. Is Fortran automatically ignoring these characters and if so, how can I get it to detect them?

Welcome to the forum!

Can you show us how you are looping over the string? You could use the index function or scan/verify, depending on your needs. As you are reading the file as a sequence of bytes, the entire contents of the file should be contained in the variable str. Can you check the length of the string? I have seen, in the fairly distant past, compilers that did not always return the right size of a file. Maybe that is the case here as well.
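For illustration, here is a minimal sketch of the index/scan approach. It assumes the file content is already in a character variable and that the system uses ASCII, so that achar(9) and achar(10) give the tab and new line characters. The module and function names are hypothetical, made up for this example.

```fortran
module find_chars_demo
   implicit none
   ! Named constants for the two control characters of interest (ASCII codes).
   character(len=1), parameter :: TAB = achar(9)
   character(len=1), parameter :: NL  = achar(10)
contains
   ! Position of the first new line in str, or 0 if there is none.
   pure integer function first_newline(str) result(pos)
      character(len=*), intent(in) :: str
      pos = index(str, NL)
   end function first_newline

   ! Position of the first tab or new line in str, or 0 if there is none.
   pure integer function first_tab_or_newline(str) result(pos)
      character(len=*), intent(in) :: str
      pos = scan(str, TAB // NL)
   end function first_tab_or_newline
end module find_chars_demo
```

index looks for a whole substring, while scan looks for any one character from a set, so scan is the better fit when several different characters matter. Comparing len(str) with the size returned by inquire is a quick way to confirm the whole file was read.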

I don’t seem to have problems detecting them in rojff. Perhaps you could look there for some inspiration?

Specifically, the file reading logic is here.

@Arjen sure… I’m working through Crafting Interpreters, so there is a loop over the characters in the string. It is something like this:

character(len=:),allocatable :: c
integer :: i, size

do i = 1,size
    c = str(i:i)
    ! select case on the character; a sample of it might look like this:
    select case (c)
    case ('\n')
        ! do something here
    case default
          continue
    end select
    write(*,*) c  ! just writing out to the console
end do

When I run this on a file containing new line characters, the write statement writes a blank line instead of a character, and my select case statement cannot detect a new line character. It writes out all the text characters in the file except for the new line characters.

This is likely the problem. You are searching for a two-character sequence instead of the desired one-character sequence. You need something like

case(achar(9))    ! horizontal tab
...
case(achar(10))   ! new line

in the select case block. BTW, the syntax for the select case is wrong, but that is a separate issue.
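To make that concrete, here is a small sketch of a character-by-character loop that uses achar in the case selectors. It assumes ASCII codes (9 for horizontal tab, 10 for new line); the module name count_controls_demo and the subroutine count_controls are hypothetical names chosen for this example, not part of the original code.

```fortran
module count_controls_demo
   implicit none
contains
   ! Count tab and new-line characters in str, one character at a time.
   pure subroutine count_controls(str, ntabs, nlines)
      character(len=*), intent(in) :: str
      integer, intent(out) :: ntabs, nlines
      integer :: i

      ntabs = 0
      nlines = 0
      do i = 1, len(str)
         select case (str(i:i))
         case (achar(9))     ! horizontal tab
            ntabs = ntabs + 1
         case (achar(10))    ! new line (line feed)
            nlines = nlines + 1
         case default
            continue         ! any other character
         end select
      end do
   end subroutine count_controls
end module count_controls_demo
```

The literal '\n' never matches here because, in standard Fortran, it is a backslash followed by the letter n. When a one-character string is compared with it, the shorter string is padded with a blank, so the two can never be equal.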

Yes, indeed. Fortran does not use the C escape sequences ;). You may want to consult the ISO_C_BINDING module, as that defines a number of such sequences as “ordinary” parameters.

Yeah, that was just a snippet of code. I edited back in the select case line.

Thanks for that. achar() was what I needed. It seems to be working now!

Edit: code corrected per @RonShepard's note below.

@FluxCapacitor ,

Welcome to this Discourse.

Re: “reading a file into a character string”, since you’re using STREAM access, I suggest parameterizing your string size based on the number of bits the processor uses for a file storage unit and for a character storage unit. Note that, in principle, the two can be different:

   .
   use, intrinsic :: iso_fortran_env, only : FSZ => file_storage_size, CSZ => character_storage_size
   .
   character(len=:), allocatable :: str
   integer size
   .
   inquire(file=flname, size=size)  
   allocate( character(len=size*FSZ/CSZ) :: str ) !<-- Note the parameterization
   .                                                           

As advised to you upthread, you can use named constants, particularly from the intrinsic module, to help with readability.

If a companion C processor is present with your Fortran processor, you can do for example:

   .
   use, intrinsic :: iso_c_binding, only : TAB => C_HORIZONTAL_TAB, NL => C_NEW_LINE
   .
   select case ( c )
      case ( TAB )
         ! do the needful
      case ( NL ) 
         ! do the needful
      case default
         .
   end select
   .

That expression might be backwards; you might need size*FSZ/CSZ. For example, if size=100, FSZ=32, and CSZ=8, then the right character length should be 400.

Also, beware that some compilers need special flags in order for this expression to work correctly. If you are using the Intel compiler, then you need the -assume byterecl option, in which case FSZ=8, CSZ=8, and the size value returned by inquire would be 400. Without that option, the actual file storage size is 32 and the size value returned would be 100, but the file_storage_size parameter is still 8, so the character string would be a factor of 4 too small.
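The arithmetic can be checked with a short sketch like the one below. The module name storage_len_demo and the functions char_len and char_len_default are hypothetical names for this example. The first function takes the bit sizes as arguments so the numbers quoted above can be plugged in directly; the second uses whatever the compiler reports through iso_fortran_env.

```fortran
module storage_len_demo
   use, intrinsic :: iso_fortran_env, only : file_storage_size, character_storage_size
   implicit none
contains
   ! Character length needed to hold nunits file storage units,
   ! given the bits per file storage unit (fsz) and per character (csz).
   pure integer function char_len(nunits, fsz, csz) result(n)
      integer, intent(in) :: nunits, fsz, csz
      n = nunits * fsz / csz
   end function char_len

   ! Same computation using the values the compiler reports.
   pure integer function char_len_default(nunits) result(n)
      integer, intent(in) :: nunits
      n = char_len(nunits, file_storage_size, character_storage_size)
   end function char_len_default
end module storage_len_demo
```

With the example values, char_len(100, 32, 8) gives 400, and char_len(400, 8, 8) also gives 400, which matches the two cases described above. Printing file_storage_size and character_storage_size on your own compiler and flags is a quick way to see which case applies.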

Yes, that’s correct.