New Release of iso_varying_string

Hi All,

I just produced a new major version release of my iso_varying_string module implementation. This comes with

  • Enable use of varying_string in read, write, and print statements.
  • Add ford documentation
  • Make var_str a generic interface, so converting to other conversion to varying_string procedures can be provided

I’m hoping the documentation is helpful for new users, but if anybody could take a look and give feedback I’d appreciate it. Also, I’ve tested the formatted read/write procedures to an extent, which should make things more convenient and familiar when doing I/O, but if anybody uses unformatted read/write and could test that out, or if you find any surprising behavior with more sophisticated use case of formatted read/write, I’d appreciate any feedback.

18 Likes

What are the advantages over allocatable characters?

You can have an array of strings of different sizes. This consequently means that you can have elemental procedures with them as output. And because you no longer need to worry about triming every string, you can actually see when they have trailing spaces.

8 Likes

This is excellent work, thank you for sharing it with the community!

I also have a type(string) implementation: it does not mimic the iso_varying_string from the standard, but so far I’ve been restrained from extending its usage much, mainly due to issues with gfortran which is a staple of my building environment.

The one issue I had is that I’ve got to use character(len=1), allocatable :: text(:) instead of character(len=:), allocatable :: text due to issues with some releases. I see you’re using the same approach here.

Also, I have tried using derived type IO, but I’ve found:

  • the standard has a limitation that you cannot define (A) format in a derived type, so, replacing charactervariables with strings won’t be necessarily straightforward;
  • gfortran has an issue with automatic arrays of strings, which seems to be true also with your module: for example
module use_iso
        use iso_varying_string
        implicit none
        contains

        subroutine do_something(n,msg)
           integer, intent(in) :: n
           type(varying_string) :: s(n) ! automatic array: ICE
           character(*), intent(in) :: msg
           integer :: i
           do i=1,n
              s(i) = msg
              write(*,*) s(i)
           end do
        end subroutine do_something
end module
program test_iso
        use iso_varying_string
        use use_iso
        implicit none
        call do_something(10,"HELLO")
end program

I would like to help contributing to gfortran but I haven’t had luck building it with clang on a Mac M1 yet, I will keep on trying though.

2 Likes

Yeah, I tried that out too, but oh well. Hopefully it won’t trip users up too bad. Documentation could help with that.

I actually just encountered that yesterday. I can say that it does actually work if you don’t have type-bound procedures, because until I released this version I did have some cases of that usage. NOTE though, that isn’t a static array, nor a statically sized array. It’s called an “automatic” array, its size depends on a procedure argument. I’m planning to work up a minimal example for a bug report in the next few days (unless there already is one).

@everythingfunctional , @FedericoPerini , and readers who work a lot with gfortran,

See this thread and this one at comp.lang.fortran; notice how Thomas Koenig was interested in a Fortran implementation of multiple-precision floating-point reliable library, ran into some issues with finalization in gfortran, and has now started some work toward enhancing the finalization facilities per the Fortran standard in gfortran.

There may be something to learn from it: about how users of gfortran can become developers of gfortran and they get to benefit immediately while also helping other users.

That suggests again resurrecting some previous efforts led by @rouson with Sourcery, Inc. which is about aiding individual efforts with gfortran with some institutional assistance toward furthering gfortran advancement:

  • Thomas Koenig is again taking up some effort, perhaps there are avenues to collaborate with Thomas and do some knowledge gathering to train other gfortran users on how to do gfortran enhancements, perhaps create training material and workflow documentation that might make it much easier to encourage more and more gfortran users to contribute to gfortran?
  • Similar to Sourcery Inc. previously, perhaps Archaelogic Inc. and/or the Community here at Fortran Discourse where many readers like @FedericoPerini are strong users of gfortran, can develop more means to accelerate gfortran enhancements. @rouson may be able to give guidance on this, @everythingfunctional may be aware of Damian’s other such effort with gfortran beyond those in the periphery of OpenCoarrays.

Just sharing a few thoughts,

1 Like

@everythingfunctional ,

Great work.

Just a couple of suggestions:

  1. consider modifying the API for the SPLIT generic interface to match the same-named new intrinsic in Fortran 202X?
  2. consider adding support for TOKENIZE to this library along the same lines as the new intrinsic in Fortran 202X?

As noted in this thread, this is in effect now a user library that indeed employs “allocatable characters” and so from some of the standard bearers’ point-of-view, it is just making use of the base language.

Note though the advantages with the derived type mentioned in the original post with the use case of “jagged” arrays of CHARACTER type and their use with ELEMENTAL subprograms do come at a cost because the practitioners are inclined to seek an equivalent of the intrinsic type whereas a user derived type, given the limitations of the Fortran standard, simply cannot do all that intrinsic types have been extended to provide over the years.

With the now deleted part of the ISO IEC standard, ISO_VARYING_STRING has no official status to be extended. And as things stand, certain users will be disappointed it’s not the same as working with CHARACTER type - some immediate examples:

   character(len=:), allocatable :: s
   s = "Hello World!"
   print *, "s%len = ", s%len, "; s%kind = ", s%kind
   print "(*(g0))", "s(1:3) = ", s(1:3) 
end 

versus

   use iso_varying_string
   type(varying_string) :: s
   s = "Hello World!"
   print *, "s%len = ", s%len, "; s%kind = ", s%kind !<-- won't work as defined now
   print "(*(g0))", "s(1:3) = ", s(1:3) !<-- won't work
end 

To me, all this shows an urgent need for a new intrinsic type of string in the Fortran standard that serves the practitioners of Fortran with all the inherent benefits and which then obviates the need for all such user libraries. It is a relatively minor effort for the standard to be extended but, my gosh, it is the hardest thing to convince the vendors to lend support for this. The above trivial example with the substring referencing alone should suffice; alas that is not the case.

The situation with the lack of an intrinsic string type is yet another data point that makes me wonder, For whom Fortran, for what?.

4 Likes

I may some day do so, but as you mention, Sourcery (now Archaeologic) has so far been able to find (and pay as appropriate) prior contributors in cases of significant urgency or importance. This has proven cheaper and easier than becoming direct contributors ourselves. We’re happy to continue the support, but direct contribution has (unfortunately) not made it to the top of our priorities.

If I’m not mistaken the API is actually relatively similar to that proposed for 202X. I’ve not been particularly happy with that API myself, but I’ll double check and see what would be prudent.

This is certainly worth considering. I’ll take a look at the interfaces and see what would make sense.

Honestly, the answer to both of these questions is that I would love to (if I had the time) update the document and re-introduce it for inclusion in F202Y. If anybody would be willing to champion it I would be happy to provide guidance and assistance. As you’ve noted in your other comment, there are a couple of features that really would be nice that could only be accomplished by implementation in a compiler. I wasn’t aware %len or %kind were possible with intrinsic characters. They’d actually be doable with my implementation. The slice notation and intrinsic format specifiers however would be really nice and can only be supported by a compiler.

Right, sorry I’ve written up my post too fast - I tend to be puzzled by all of this nomenclature!

Regarding gfortran, I’ve filed a bug here in case you have further examples/edge cases to add. It’s my understanding that each of such bug reports eventually ends up in a unit test, so the more the better.

Also note that the DTIO issue with automatic arrays is not specific to character variables but it affects any derived types with type-bound DTIO

Finding myself in the same process to decide whether replacing character variables with a derived type, I’ve found that:

  • this type of issues is relatively easily addressed with an IDE: you just need to replace s%len with s%len() and define a function that’s

type :: string
  ![...]
  procedure :: len => string_len
end type string
![...]
elemental integer function string_len(this)
   class(string), intent(in) :: this
   string_len = merge(len(this%raw),0,allocated(this%raw))
end function string_len
  • (IMHO) It is less straightforward to find good usage of DTIO that doesn’t mean replacing all character outputs with (DT).Would have been far better if the standard allowed DTIO to also extend the intrinsic type, e.g., (A) for “write as character”, (fn.w) for “write as float”, etc.
    With gfortran, if any of such labels is used for DTIO, nothing will be printed out, which I believe is the correct way of doing it form the standard standpoint, but harder to debug.
  • Another very useful feature of a user-defined string would have been to use DTIO to do advancing and non-advancing list read. That would have meant using the same facilities to either read in from a file, or from a string that contains the whole file (faster, I bet). I don’t think that’s ever going to be possible.

No, if one were to go by the standard semantics, the closest analog to the CHARACTER intrinsic type will be a parameterized derived type with a kind-type parameter named KIND itself and a length-type parameter named LEN:

..
type :: string_t(kind, len)
   integer, kind :: kind
   integer, len :: len
   private
   character(kind=kind, len=len) :: m_s
contains
..

Is there a proposal for an intrinsic string type somewhere?

Nice, I hadn’t thought about it… that would be a very elegant way to have a variable-length string type. The only limitation is like mentioned, that you couldn’t have an array of strings with different length/kind.

I have little practical experience in parameterized derived types, cause I’ve always assumed compilers weren’t ready for them yet. Maybe I should give them more attention though.

Perfect. Thanks.

1 Like

I don’t think you actually need for them to be type-bound procedures. You could make them components of the type and define them any time the object is (re)defined. And once we have protected components you can even ensure users don’t modify them directly and inadvertently get inconsistent state.

1 Like

While one would initially be tempted to try this, you can’t have an array of derived types with different values for the length-type parameter, and you’re right back to the problem this module is trying to solve in the first place.

1 Like

The document on which I based this module was exactly that. It was unfortunately abandoned. From what I’ve heard (but not been able to confirm) those on the committee believed at the time that character(len=:), allocatable variables sufficiently addressed all needs without the necessity of a new type. I now believe they were wrong and that this document should be reintroduced, but don’t have time to champion it myself as I’m focusing my efforts on templates.

I thought @FortranFan was suggesting a new intrinsic type for string (i.e. in addition to logical, integer, real, complex, and character), rather than a new intrinsic module containing a derived type for string.

Is there any technical benefit to making the string module be intrinsic rather than part of a standard library. Efficiency gains or something?

1 Like