String manipulation topic

I brought up f-strings in another topic where @urbanjost answered and showed a very impressive way to produce very similar behavior with Fortran.

I finally had some spare time to play around and started thinking that what about “replace”. There is a very old issue at (still open) Replace substring in character/string_type · Issue #366 · fortran-lang/stdlib · GitHub. I would be nice to have such feature.

I also started wondering could one do something fun with operators. Here is something that I built on top of @urbanjost 's example in the other thread:

program main
implicit none

integer                      :: i
real                         :: a
character(len=:),allocatable :: text

character(len=4096) :: line
character(len=*),parameter   :: asis='(*(g0))'

type :: string_replace_t
  character(len=4096) :: str
  character(len=4096) :: replaced_substr
end type

type(string_replace_t) :: replacable

interface operator(.replace.)
  procedure create_replace_type
end interface

interface operator(.with.)
  procedure replace_type_with
end interface

   i=1
   a=1.123456789
   text="this"

  !py!print(f"{i} is an integer, {a:1.2f} is a rounded real and '{text}' is text")

   print asis,i," is an integer, ",fmt(a,'f0.2')," is a rounded real and '",text,"' is text"

  ! put it into a string with internal write
   write(line,asis)i," is an integer, ",fmt(a,'f0.2')," is a rounded real and '",text,"' is text"
   call doop(trim(line)) ! do something with line

  ! concatenation works too and can be passed as an argument, so more compact in some cases
   print asis,fmt(i)//" is an integer, "//fmt(a,'f0.2')//" is a rounded real and '"//text//"' is text"

   line=fmt(i)//" is an integer, "//fmt(a,'f0.2')//" is a rounded real and '"//text//"' is text"

   ! replace can be done with operators
   line = line.replace."text".with."other text"
   print asis, trim(line)

   ! other example:
   print asis,fmt(i)//" is an integer, "//fmt(a,'f0.2')//" is a rounded real and '"//&
     text//"' is text".replace."is a rounded real".with."IS A ROUNDED REAL"

contains

function create_replace_type(str, replaced_substr) result (sout)
  character(len=*), intent(in) :: str, replaced_substr
  type(string_replace_t) :: sout

  sout % str = str
  sout % replaced_substr = replaced_substr
end function create_replace_type

function replace_type_with(string_replace, with_str) result (sout)
  type(string_replace_t), intent(in) :: string_replace
  character(len=*), intent(in) :: with_str
  character(len=:), allocatable :: sout
  integer :: i, rstart, rend

  rstart = 0
  match_loop: do i = 1, len_trim(string_replace % str)
    if (rstart == 0 .and. string_replace % str(i:i) == string_replace % replaced_substr(1:1)) then
      rstart = i
    else if (rstart > 0 .and. len_trim(with_str)==i-rstart+1) then
      rend = i
      exit match_loop
    else if (rstart > 0 .and. string_replace % str(i:i) /= string_replace % replaced_substr(i-rstart+1:i-rstart+1)) then
      rstart = 0
    end if
  end do match_loop

  sout = string_replace % str(1:rstart-1) // with_str // string_replace % str(rend+1:len_trim(string_replace % str))
end function replace_type_with

subroutine doop(string) ! do some operation on string
character(len=*),intent(in) :: string
integer :: i
   write(*,asis)(string(i:i),i=len(string),1,-1)
end subroutine doop

recursive function fmt(generic,format) result (line)
use,intrinsic :: iso_fortran_env, only : int8, int16, int32, int64, real32, real64, real128

! ident_4="@(#) M_framework__msg fmt(3f) convert any intrinsic to a string using specified format"

class(*),intent(in)          :: generic
character(len=*),intent(in),optional  :: format
character(len=:),allocatable :: line
character(len=:),allocatable :: fmt_local
integer                      :: ios
character(len=255)           :: msg
character(len=1),parameter   :: null=char(0)
integer                      :: ilen
   if(present(format))then
      fmt_local=format
   else
      fmt_local=''
   endif
   ! add ",a" and print null and use position of null to find length of output
   ! add cannot use SIZE= or POS= or ADVANCE='NO' on WRITE() on INTERNAL READ,
   ! and do not want to trim as trailing spaces can be significant
   if(fmt_local == '')then
      select type(generic)
         type is (integer(kind=int8));     fmt_local='(i0,a)'
         type is (integer(kind=int16));    fmt_local='(i0,a)'
         type is (integer(kind=int32));    fmt_local='(i0,a)'
         type is (integer(kind=int64));    fmt_local='(i0,a)'
         type is (real(kind=real32));      fmt_local='(1pg0,a)'
         type is (real(kind=real64));      fmt_local='(1pg0,a)'
         type is (real(kind=real128));     fmt_local='(1pg0,a)'
         type is (logical);                fmt_local='(l1,a)'
         type is (character(len=*));       fmt_local='(a,a)'
         type is (complex);                fmt_local='("(",1pg0,",",1pg0,")",a)'
      end select
   else
      if(format(1:1) == '(')then
         fmt_local=format(:len_trim(format)-1)//',a)'
      else
         fmt_local='('//fmt_local//',a)'
      endif
   endif
   allocate(character(len=256) :: line) ! cannot currently write into allocatable variable
   ios=0
   select type(generic)
      type is (integer(kind=int8));     write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (integer(kind=int16));    write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (integer(kind=int32));    write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (integer(kind=int64));    write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (real(kind=real32));      write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (real(kind=real64));      write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (real(kind=real128));     write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (logical);                write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (character(len=*));       write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
      type is (complex);                write(line,fmt_local,iostat=ios,iomsg=msg) generic,null
   end select
   if(ios /= 0)then
      line='<ERROR>'//trim(msg)
   else
      ilen=index(line,null,back=.true.)
      if(ilen == 0)ilen=len(line)
      line=line(:ilen-1)
   endif
end function fmt

end program main

So basically one can now do things like "some string".replace."some".with."A". It might not be super efficient but looks pretty neat. Maybe I am not the only one who thought of it. Has anyone else created such things? I’d be interested.

1 Like

https://urbanjost.github.io/M_strings/man3.html points to the M_strings__oop interface. I see it showing up twice in the (automatically built) index. I will fix that. I think string%substitute is similiar; I thought between M_strings and M_overload I had a number of string operators defined, but I do not see them; there are some but I may never have posted it. I think stdlib has some for the variable-length string type defined there. Maybe M_strings__oop should be a separate fpm project that uses M_strings as a dependency. It is not very visible as-is. If the official repository gets going I planned on cleaning up and removing some duplicates from M_strings and M_strings__oop but have not worked on that direction in quite a while. I do not know of any basic string operation M_expr, M_strings, M_overload and M_strings__oop has missing from what other language provide accept more expansive regular-expression support. There probably is some, but not any I use.

1 Like

I developed a small module for “positional” formatting - see io_utilities/src/positional_formatting.f90 at main · arjenmarkus/io_utilities · GitHub. It is not feature-complete, as seems to be the term, but it might be useful.

1 Like

Almost a natural language feel. I personally have not seen that approach before. The functionality is similar to the OOP approach in string%substitute but the more I use it the more this “operator” interface approach grows on me; sort of like functional programming.
Combined with the module below if I get some time it would be interesting to rework M_strings__oop.f90 so it can work with an intrinsic string via an OOP interface like this.

1 Like

That’d be cool!

I am now thinking example usage for the full replace feature:

line = line.replace."text".with."other text" ! Replace one instance
line = line.replace_all."text".with."other text" ! replace all instances
line = line.replace.3.times."text".with."other text" ! replace 3 instances
line = line.replace_reversed.3.times."text".with."other text" ! replace 3 instances starting from the end

Did I miss anything? I guess the approach is theoretically possible. It also has some more natural language feel, like you say. Somehow “Pythonic” even? :grin:

Edit. Allowing more composability by introducing unary .all. and .reversed.:

line = .all.(line.replace."text").with."other text" ! replace all instances
line = .reversed.(line.replace.3.times."text").with."other text" ! replace 3 instances starting from the end

But I still don’t see the point of unary operators, why not then just use the function directly:

line = all(line.replace."text").with."other text" ! replace all instances
line = reversed(line.replace.3.times."text").with."other text" ! replace 3 instances starting from the end

Edit 2.

Actually, it is probably better to have “all” as default, then we’d have:

line = line.replace."text".with."other text" ! replace all instances
line = line.replace.3.times."text".with."other text" ! replace 3 instances
line = reversed(line.replace.3.times."text").with."other text" ! replace 3 instances starting from the end