How do I allocate an array of strings?

@FortranFan, I understand you disagree with decisions made by the committee and that is frustrating. I often disagree with some of the decisions, and also find it frustrating. But this excessive hostility is unhelpful at best, and frequently spreads misleading information at worst.

Just because we don’t always prioritise what you would like us to does not mean we don’t prioritise users. We have limited resources and have to prioritise between various users. I’m sorry that’s frustrating, but you could try and contribute instead of just complaining.

Plenary is not intended to be the time when language design work happens. Plenary is the time subgroup and individual work is presented to the rest of the committee for general discussion and voting. The real work is supposed to happen elsewhere.

This is simply untrue. Anyone is welcome to join the committee. We encourage it. And even if you don’t want to officially become a committee member, we are happy to receive, and frequently ask for, outside contributions.

To anyone who is frustrated with what the committee prioritises or designs, I encourage you to try your hand at doing the work to officially propose what you want to the committee. This entails identifying the exact language that needs to be added or changed in the standard and trying to ensure that it doesn’t result in any inconsistencies with the existing features of the language. I’m happy to help with the logistics of submitting papers and presenting to the committee for anyone who would be interested in trying it. I don’t have the bandwidth to do the design and authoring of papers for everything though.

What I don’t appreciate is complaints about the committee with no real solutions presented. If you want the committee members to do the work for you, you have to convince us that your idea is worth prioritising over other work, and that also takes more than just complaints.

This is as much for all the Fortranners who care about a language that evolves at a good pace, both with general advances in what is veritably the third leg of science (computation, to complement theory and experiment) and with their needs in the practice of Fortran in whatever endeavors they pursue, as it is for the standard committee members including @everythingfunctional. The point is this:

The “TYPE” keyword for an intrinsic type with the concomitant parentheses is absolutely redundant (noise, useless, etc.) Period.

Consider a Fortran 2008 conformant processor: one can do

   type(character(len=:)), allocatable :: s
   s = "Hello World!"
   print *, "String length of s: ", s%len
end

C:\temp>gfortran -ffree-form p.f -o p.exe

C:\temp>p.exe
 String length of s:           12

So here the intrinsic type CHARACTER behaves like a derived type, and to a casual reader it even appears as if it were a derived type, yet it is not.

And the TYPE keyword and the parentheses are unnecessary, as one can notice in countless instances of actual practice where few, if any, practitioners employ such a thing. Now I can see autogenerated code doing such a thing, but even that is quite outdated given the capabilities advancing in the realm of generative AI.

This is why in the recommendation in this link I suggest a string intrinsic type be introduced that includes certain facilities and features for the best interests and conveniences of the practitioner which also cover aspects that make it appear as if the type were a derived type.

However there is absolutely no need for the standard to make such a distinction in the standard text with a derived type for this new string type and then seek to IMPOSE syntax and semantics which are for derived types. And yet, just as with %len for CHARACTER type and %IM and %RE with COMPLEX types, the language must include certain convenience aspects for the practitioner with this new STRING type.

But it’s all a pipe dream anyway from an ISO/IEC standard point of view. This ain’t gonna make it to the 202Y worklist. Given how much one alternate member on the committee (whose vendor employer itself is not on the committee and doesn’t sponsor it), and possibly other silent members, oppose such a new type in the language standard, the discussions are more for any young group of Fortran enthusiasts who finally see the light and go ahead and move beyond the standard process and implement something intrinsically in a Community Fortran processor, say LFortran.

It’s remarkable how the slightest details can become highly controversial points without warning… Now type() is the new stumbling block of Fortran… What’s next? call?

I have proposed the following before (and have been met with resounding silence) but I’ll repeat it again as an alternative to a derived type for an array of variable length strings.

Use the following syntax, which basically replaces parens with curly brackets to signal to the compiler that the object isn’t a traditional array, and to signal a constructor that has variable length data:

character(LEN=:), allocatable :: strings{:}

allocate (strings(3))
strings = {"a", "a string", "a much longer string"}

Supplement that with an expanded set of string/character utilities, and I think you can end up with something that is still “Fortranic” in its syntax and enables most of what I as a user need from an array of strings. Basically you are just replacing parens with brackets, and from that point everything (for the most part) functions as a standard array.

I also think that the same syntax could be used for an array of pointers but that will take a little more head scratching to figure out what the “gotchas” might be.

Edit:

You can also expand this to a formal string type (if such a thing were added to the standard) where

character(LEN=:), allocatable :: strings{:} 

could be replaced with just

string, allocatable :: strings{:}

i.e.

string

implies

character(LEN=:)
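
For contrast, here is roughly what standard Fortran offers today without any new syntax. A deferred-length allocatable character array is allowed, but every element shares a single length, so shorter values are blank-padded. This is only a sketch; the module and function names are made up for illustration.

```fortran
module fixed_width_demo
  implicit none
  private
  public :: padded_strings

contains

  ! Build a 3-element character array. The typed array constructor pads
  ! every element to the same length (20 here, the length of the longest value).
  function padded_strings() result(strings)
    character(len=:), allocatable :: strings(:)

    strings = [character(len=20) :: "a", "a string", "a much longer string"]
  end function padded_strings

end module fixed_width_demo
```

Calling `padded_strings()` gives an array for which `len` is 20 for every element, while `len_trim` still recovers 1, 8 and 20. The per-element length survives only as trailing blanks, which is the limitation the curly-bracket proposal is trying to remove.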

Michael Metcalf spoke of a string type in his essay Has Fortran a Future? (published in 1984). Here is the excerpt:


In 2015, Clive Page published a nice summary of his thoughts on String Handling in Fortran: https://fortran.bcs.org/2015/suggestion_string_handling.pdf

A few of his thoughts at the time:

At present the programmer has three options for handling character strings:
[…]
In my opinion none of them provides all the facilities that are really desirable.

So: what should we teach new users of Fortran about handling strings? Should they learn about all of these facilities, or only some of them? I really don’t know.

I think we need a fully-dynamic character string type, such that when you have an array of them, each element can have its own length, and that the string length should be set or re-set in all situations when the string value is modified.


Anyway, the Fortran-Lang community has invested a lot of effort, including two GSoC students, into providing string handling facilities in stdlib. Here’s a little demo:

Make your new project:

$ fpm new string_demo
$ cd string_demo

Modify the fpm.toml to include stdlib as a dependency:

[dependencies]
stdlib = { git="https://github.com/fortran-lang/stdlib", branch="stdlib-fpm" }

Start using the stdlib string type:

program main
  use stdlib_string_type, str => string_type
  implicit none
  type(str) :: a(5)
  integer :: i 

  a = [ str('mary'), str('had'), str('a'), str('little'), str('lamb')]
  
  ! Watch out, derived-type IO is needed!!! (there may be bugs!)
  write(*,'(*(DT))') a // ' '

  ! Alternative is to convert to character first
  write(*,'(*(A))') (char(a(i)) // ' ', i = 1, size(a))

  ! The `char` function is not elemental, because it would need to
  ! pick a fixed-length.

end program main

~/fortran/string_demo$ fpm run
main.f90                               done.
string_demo                            done.
[100%] Project compiled successfully.
mary had a little lamb 
mary had a little lamb 

(The compiler I’m using is GNU Fortran 13.2.0.)

A minor improvement in the usability of the stdlib string_type would be to have a rule that the structure constructor would be called in array construction:

  a = [ str('mary'), str('had'), str('a'), str('little'), str('lamb')]
  
  ! vs
  
  a = [ str :: 'mary', 'had', 'a', 'little', 'lamb']

Currently, the language appears to forbid this:

$ fpm run
main.f90                               failed.
[ 50%] Compiling...
app/main.f90:12:14:

   12 |   a = [ str :: 'mary', 'had', 'a', 'little', 'lamb']
      |              1
Error: Cannot convert CHARACTER(4) to TYPE(string_type) at (1)

This is kind of inconsistent to me, given that the string_type provides an assignment operator, meaning the following is allowed:

type(str) :: a(2), b  ! str is the local name for string_type, as in the demo above

b = 'assign'  ! OK
a = [ str :: 'assignment', 'test'] ! NOT ALLOWED
a = [ str :: str('assignment'), str('test')] ! ALLOWED

In addition, if we could use A instead of DT for formatting, we’d be almost where we’d like to be.

Addendum: at least for the built-in types and conversions, array construction works:

  type(integer) :: c(3)
  c = [integer :: 1, 5.0, 6.0d0]
  print *, c 
  end

What would be necessary language-wise to introduce a promotion-rule from character(len=*) to string_type in array initialization?
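
Until the language grows such a promotion rule, one workaround that needs no new syntax is an elemental constructor applied to a blank-padded character array. What follows is a minimal sketch with a hypothetical wrapper type (`demo_str` and `from_padded` are invented names, not part of stdlib). It trims trailing blanks, so intentional trailing spaces are lost.

```fortran
module trimmed_ctor_demo
  implicit none
  private
  public :: demo_str, from_padded

  ! Hypothetical stand-in for a string type with one allocatable component.
  type :: demo_str
    character(len=:), allocatable :: s
  end type demo_str

contains

  ! Elemental, so it maps over a whole character array in one call.
  ! Each result element keeps only the non-blank prefix of its input.
  elemental function from_padded(c) result(res)
    character(len=*), intent(in) :: c
    type(demo_str) :: res

    res%s = trim(c)
  end function from_padded

end module trimmed_ctor_demo
```

With this, `a = from_padded([character(len=6) :: 'mary', 'had', 'a', 'little', 'lamb'])` fills a five-element `type(demo_str)` array in one statement, at the cost of choosing a padding width up front.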

This ain’t any “stumbling block”, this is just discourse!!

But the point remains a committee for Fortran with voting members who are mostly vendor reps should be more than willing to consider the semantics and syntax that do not impose any added verbosity on the practitioners.

This is a most basic consideration.

There is absolutely no good technical reason to impose the type(..) on Fortranners, but should an intrinsic string type ever make it to the standard (and that’s a big if in our lifetime), you can bet your last dollar this will be present.

I am only bringing this up now so that Fortranners can be completely aware this was recognized well before the committee had ever voted to work on this, which may not happen before Fortran 204Z!!!

This is not a “minor improvement”! Rather this structure construction toward arrays shall be a feature requirement.

Again and again, the question comes down to “For whom Fortran, for what?”

A string type is so basic, so intrinsic to any human endeavor for which Fortran can serve well that to consume such a type, a Fortranner should not have to

  1. Employ the USE statement,
  2. Carry the risk that the Fortran processor may not support the stdlib, which is a distinct possibility.

The Fortran-lang stdlib string_type is in some sense not far from an intrinsic type. It supports the same intrinsic functions and operators.

In the footsteps of the early C++, one could build a demo implementation of a new intrinsic string type using a preprocessing step to the stdlib string type:

string :: a(5)
! becomes
type(string) :: a(5)

a = [string :: 'one', 'two', 'three']
! becomes
a = [string('one'), string('two'), string('three')]

print *, a(3)(1:2)   ! or a(3)%c(1:2)
! becomes
print *, char(a(3),1,2) ! range variant

print *, a(2)(1)
! becomes
print *, char(a(2),1) ! position variant

print *, a
! becomes
print *, (char(a(i)), i = 1, size(a)) ! i must be declared; an I/O implied-do takes no type-spec

@ivanpribec Isn’t the stdlib_string_type competing with iso_varying_string?

Given the lack of human resources to work in the committee, I am somewhat glad they do not waste their time working on changes (such as suppressing type(), call, or whatever) that will not add a single new feature to the language.

When we started stdlib, string handling was one of the painful points we wanted to resolve. If you look at one of the original issues (https://github.com/fortran-lang/stdlib/issues/69) we found there was tons of prior art and reinvention going on.

The stdlib string_type comes close to the former varying_string proposal. The patch which introduced it was this one: https://github.com/fortran-lang/stdlib/pull/320. The differences are summarized in this post by @awvwgk :

  • there is no assignment from string to character
    • reason: there can be no assignment defined which covers both fixed length characters and deferred length characters as LHS
  • all procedures return a fixed length character rather than a string instance
    • reason: returning a derived type makes the handling of string types more involved, instead the fixed length character is converted back to a string type by assignment
    • drawback: assigning the return value to a string might create a temporary variable on the stack
  • no support for get and put
    • reason: derived type IO is used instead

The second bullet had to be relaxed for some reason, so now the functions operating on string_type also return a string_type:

  type(str) :: b
  character(len=:), allocatable :: bc

  b = repeat(b,2)        ! OK
  bc = repeat(b,2)       ! NOT OK
  bc = char(repeat(b,2)) ! OK

In addition, Sebastian Ehlert (@awvwgk) wrote a proof-of-concept abstract string base class (awvwgk/stdlib_string on GitHub, an exploration of string support for the Fortran standard library) and prepared versions of both the FTL library ftlString class and the StringiFor string class that inherit from the abstract base class.

Personally, I prefer the functional string type, compared to the “heavy” object-oriented one.

Yes it is.

But so does every single derived type in the spirit of the ones shown by @rwmsu, @RonShepard, or even Michael Metcalf in 1984:

! @rwmsu's string
Type DLstring_t
   Character(LEN=:), allocatable :: aString
End Type

! Ron's string type
type string_type
   character(:), allocatable :: s
end type string_type

! Metcalf's string
TYPE String(Maxlen)
  INTEGER :: Length
  CHARACTER(LEN=Maxlen) :: String_data
END TYPE String

I bet almost any seasoned Fortran programmer has used their own one at some point.
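
As a rough sketch of how such a wrapper gets used for a ragged array of strings (the names here are hypothetical and belong to none of the types quoted above), each element carries its own allocatable component and therefore its own length:

```fortran
module ragged_strings_demo
  implicit none
  private
  public :: demo_string, lengths

  ! Hypothetical wrapper, same shape as the homebrew types shown above.
  type :: demo_string
    character(len=:), allocatable :: s
  end type demo_string

contains

  ! Return the length of every element; unallocated elements count as 0.
  pure function lengths(strings) result(n)
    type(demo_string), intent(in) :: strings(:)
    integer :: n(size(strings))
    integer :: i

    do i = 1, size(strings)
      if (allocated(strings(i)%s)) then
        n(i) = len(strings(i)%s)
      else
        n(i) = 0
      end if
    end do
  end function lengths

end module ragged_strings_demo
```

After `allocate(a(3))` and element-wise assignments such as `a(1)%s = "a"`, `a(2)%s = "a string"` and `a(3)%s = "a much longer string"`, the call `lengths(a)` returns `[1, 8, 20]`. Even this tiny helper hints at how much supporting code a usable string type needs.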

But for serious work it’s not just the basic type which is needed. You need all the procedures that belong together with it and are a pain to write and debug. On top of the procedures you need the documentation (personally, I’m not yet satisfied with our stdlib one) to go along with them, and the instructions on how to build the software using a plethora of different build systems.

All of these problems go away when something is shipped with the compiler.

Addendum: Essentially we encounter the same problem as Tim Mattson spoke of for parallel programming languages:

  • the standards community has failed us by not standardizing a string type (be it intrinsic - better, or derived - worse); if it didn’t pass in 2000 why would it pass now?
  • as the application developer community we have failed ourselves, by not joining forces earlier and figuring out ways to reliably distribute high-quality Fortran modules including for strings

So the two solutions to the problem are either stronger committee involvement and pressure on the compiler vendors, or stronger collaboration in the Fortran community. Next time someone asks how to get an array of strings, recommend them a central solution like stdlib and not your own homebrew.

But these latter ones are not meant to be shipped with the compilers. Both iso_varying_string and stdlib aim at being shipped with the compilers, so we’ll be in the strange situation of having two competing “quasi-standard” string types, after decades without a single one (and even stranger if the “least official one” is more advanced than the brand new “most official one”). Hence the question: “which one should I recommend?”

IMO, the iso_varying_string should drop the iso prefix until it actually becomes part of the standard.

The aim of stdlib is to become a de facto standard. In the past I have remarked that it would be nice if vendors showed interest in packaging it in their compiler distributions, but so far I’ve seen almost no sign of interest.

I believe @certik has previously suggested making the stdlib modules part of LFortran, but this would not make them part of the Fortran specification. Quoting section 14.2.1 of J3/18-007r1:

A module that is provided as an inherent part of the processor is an intrinsic module. A nonintrinsic module is defined by a module program unit or a means other than Fortran.

Procedures and types defined in an intrinsic module are not themselves intrinsic.

I haven’t used the varying_string module enough to judge fully, but I understand the main differences are that it provides put and get procedures, which we rejected in stdlib in favor of derived-type I/O, and that it provides an additional assignment operator from varying_string to character, whereas in stdlib we opted for an explicit conversion using a function named char. (@everythingfunctional, please correct me if I’m wrong.)

Were the string_type to ever become standard (either as an intrinsic type, or a modified varying_string proposal) I believe we could deprecate the stdlib one and amend our documentation to suggest using the one that is part of the standard. But in principle it would remain viable forever as it’s just a non-intrinsic user derived type.

This iso_varying_string thingy has no official status whatsoever.

There used to be a Part 2 of the ISO/IEC 1539 standard publication but that has long been deleted (the base language Fortran is Part 1, i.e., 1539-1).

I don’t understand why that other poster attaches official anything (“quasi-standard”, “most official”!!) to iso_varying_string!!

As you indicate, perhaps the ISO prefix is lending it some “status” that’s otherwise absent altogether.

Even in Fortran-Lang we don’t practice dogfooding to the extent we should. The Fortran package manager also has its own string type and accompanying routines:

Since fpm is also an fpm package, in principle you can use it too by including it as a dependency (but don’t do this please!):

[dependencies]
fpm.git = "https://github.com/fortran-lang/fpm"

I believe the reason is just historical: when fpm was under initial development, either the string_type was not there yet, or stdlib was not yet available as an fpm package (a chicken-and-egg dilemma).

“that other poster” has a name/pseudonym, and it ain’t “the one that must not be named”.

Let’s see… the “iso” prefix, maybe? If this module is no longer specified at the ISO level, then indeed the prefix should be removed. In the meantime you cannot expect people to guess, or be surprised if some people are misled by the prefix.

You’re correct. One nuance to point out: iso_varying_string also overloads the char function, if that is desired.

While it was never officially made part of the Fortran language, it did get close, and is listed by ISO: ISO/IEC 1539-2:2000 - Information technology — Programming languages — Fortran — Part 2: Varying length character strings

From what I have heard, the only reason it did not get officially included into the standard is because it was thought that deferred length allocatable character variables were sufficient. Clearly that’s not quite true or we wouldn’t all be asking for it now.

Just a point of information: The current roster of voting members is 5 vendors (AMD, Arm, HPE, IBM, Nvidia), 7 user organizations (Argonne*, Lawrence Berkeley, Lawrence Livermore, Los Alamos, NASA, NCAR, Oak Ridge), and two people formerly associated with vendors, but now “independent” (Lionel, me).

  * Argonne is advisory until they attend their second meeting. We encourage them to do so.