`stdlib` `string_type` `parameter` array currently possible?

There is the possibility of an approximation.

associate(my_string_list_parameter => [string_type("foo"), string_type("bar"), ...])
  ! my_string_list_parameter is a constant in here
end associate
2 Likes

Good suggestion, but unfortunately I need that array as a public, global entity of a module used in several places.

In theory, there might be a way to have it initialized before program start. I’ve heard it’s possible in C to execute code before main. Declare a static variable in a C file that is “initialized” by calling a private, bind(c) procedure in the module, that initializes your “constants”. Make the variables protected, instead of public, and then nothing can modify them. It’s hacky, and everybody who ever sees it will be completely baffled, but it should work.

1 Like

Yes, thanks for the hint, I might consider it. I vaguely remember that this strategy was (maybe) also used somewhere in stdlib but the details escape me now.

I think Fortran 2018 does, and both gfortran -std=f2018 and ifort -stand:f18 compile a run a code such as

    character(len=9), parameter :: dayname(7) = [character(len=9) :: &
      'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', &
      'Saturday', 'Sunday' ]
print*,dayname
end

for versions

GNU Fortran (GCC) 12.0.0 20210718
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.1 Build 20201112_000000

This is valid code, however, the values in the array do all have length 9. All of the strings in that array are the same length, some of them are just automatically padded with “blanks”. I’ve got an example about why this might not be the best idea.

program main
  use iso_fortran_env, only: character_kinds
  implicit none

  integer, parameter :: c1 = character_kinds(1), c2 = character_kinds(2)
  character(len=5, kind = c1), parameter :: foo1 = "foo"
  character(len=5, kind = c2), parameter :: foo2 = "foo"

  print *, foo1 == foo2
  print *, foo1 == "foo"
  print *, foo2 == "foo"
end program

It is not defined whether this program will print T or F for any of the 3 expressions. Specifically, the blank character is only specified to be space for characters of default kind (i.e. the “foo” literals on the last 2 lines). The comparison operators say that the shorter character is padded with blanks to the length of the longer character, but those blanks could be different characters between different kinds. It is also not defined what the order of the kinds are in the character_kinds array, so either c1 or c2 or neither might be the default kind.

Granted I have not seen a processor that will actually print F for any of those, but it could and would still be considered standards conforming.

I am almost in disbelief that even something as simple as the following is not allowed

program main
  use stdlib_string_type
  implicit none

  type(string_type) :: foo = string_type("foo")
  type(string_type) :: bar = string_type("barr")
  type(string_type) :: baz = string_type("bazzz")
  type(string_type) :: quux = string_type("quuxxx")
  type(string_type) :: corge = string_type("corgeee")
  type(string_type) :: str_param_array(5) = [foo, bar, baz, quux, corge]

  print *, str_param_array
end program main

Looks like that without an executable statement (that I cannot use in the declarative section of a module) I cannot create the array. Frustrating.

1 Like

Sadly, we can’t have a constructor for the string_type that can be used to set a parameter value.

One issue is that we are providing a constructor for the string_type to overwrite the default one. This is mainly to keep the raw value safe from user access and to help handle some corner cases where unallocated deferred length character variables are passed to the default constructor.

We could make the default constructor accessible by making the raw value public and require users to promise to not touch it under any circumstances, but this won’t allow the usage of the default constructor in a parameter initialization, because the component we want to initialize is allocatable:

/home/awvwgk/projects/src/git/stdlib/_build/src/stdlib_string_type.f90:44:52:

   44 |     type(string_type), parameter :: a = string_type("test")
      |                                                    1
Error: Invalid initialization expression for ALLOCATABLE component ‘raw’ in structure constructor at (1)

Using the user-defined assignment to initialize a string_type from a fixed length character value is also not possible because the procedure is user-defined.

Surprisingly, implementing the string_type already pushed quite hard on the limits of what is currently possible in Fortran. For example, we don’t have an assignment(=) from string_type to character variables because this can’t be implemented for both fixed-length and deferred-length characters in a consistent way.

2 Likes

Just to make it crystal clear: I am very grateful for the excellent work on string_type in stdlib.

My frustration came from the realisation that the current limitations of the language are more problematic than I imagined. I would dare to say (although I cannot be totally sure) that any major language in use today does not limit the user to this extent for something as trivial as setting a constant list of strings with elements of different length.

2 Likes

Fully agree with this statement.

The most important lesson I learned from this is that user-defined assignment(=) and automatic reallocation of the LHS don’t work well together:

  • the assignment(=) is a subroutine and therefore requires explicit declaration of the the LHS
  • adding the allocatable attribute for the LHS limits the assignment(=) to LHS variables with allocatable attribute
  • leaving out the allocatable attribute allows only non-allocatable or already allocated LHS variables in the assignment(=)
  • this scenario is most relevant for deferred-length characters and deferred-length parametrized derived types

I wonder what would be the required change to allow users to define an assignment(=) operator that works seamlessly together with automatic LHS reallocation.

There are a couple of other things that bring interesting complications with implementing a fundamental data type like a string in Fortran. For example the user defined derived type input and output (UDDTIO) really made a lot of trouble and is still in a half-finished state for string_type.

3 Likes

All this pain with string_type stems from the fact that Fortran has not been designed to be able to implement user types that behave like intrinsic types, or at least work consistently with other language features such as =.

If we expose the internal string, wouldn’t that fix the problem? I thought you can have an array of different lengths strings as long as you keep the (allocatable) string as a component of a derived type and I thought that is precisely what string_type is for… So I am confused.

Making the internal representation public creates some interesting aliasing issues for at least the assignment(=) operator:

type(string_type) :: str
str = str%raw

Exposing the raw representation allows to hack around some issues, but it also introduces new ones. In particular, we lose the memory safety introduced by string_type as the raw value can always be unallocated.

Also, for the specific use case of using the default constructor, the allocatable attribute is the actual problem, which can’t be mended by making the raw representation public.

2 Likes

If it is without a constructor, would it then allow to create an array of strings? I still don’t understand what feature that we implemented blocks this use case. Yes, it might be that we will have to choose which nice features we can keep and which we cannot.

1 Like

Still doesn’t work for parameter initialization as I explained here:

It’s a limitation of the Standard not out implementation which disallows using it to initialize parameters. So far there seems to be no hack which we could use to work around the Standard limitations here.

5 Likes

Yes, you are completely correct. I suspect that this is exactly what happened when Stefano Zaghi made public the inner allocatable in StringiFor following my request and that introduced regressions like this one.

I believe that there are so many custom-made Fortran implementations of string arrays of varying length that at least might be possible to identify and pick the best design choices or, I’d better say, the less worse considering the limitations of the language.

1 Like

I should mention that I proposed a string_class for stdlib as well some time ago:

This implementation did allow libraries like stringiFor or ftlString to inherit from the same base class and subsequently make the seamlessly intercompatible with each other and stdlib’s string_type. But we dismissed the idea in favor of the string_type at this time. My repository with the implementation should be still around.

1 Like

Do you think resorting to a preprocessor could resolve my problem and define a constant literal linked to an array of strings with items of different length?

10.1.5.1 Intrinsic operation classification, para 6, line 27-28:

The kind type parameters of the operands of a character relational intrinsic operation shall be the same.

Line 9 is always wrong. Depending on whether c1 or c2 corresponds to default kind, Line 10 or 11 will be wrong.

I stand corrected. Thanks for the info. I had just assumed that because 1.0e0 == 1.0d0 was valid, something similar would be for characters as well. But once again we all see what happens when you assume…

1 Like

Welcome to the club (of the unfortunate :sob: )

See my comments in this thread, especially " How easy and convenient is to develop libraries in Fortran that can used effectively and productively by the broader scientific and technical community? In year 2020, the reality is it is extremely difficult, impossible even in some situations. This extends to both standard libraries as well as other community / domain-specific libraries. The issues here are another indicators of where the language needs to improve."

Add the discussion here with what should be trivially simple string_type development in a modern language to the issues I mentioned in the thread above.

We, in industry, realized the limitations mid to late 2000s given so many loose-ends in the 2003 edition.
Though Fortran 2008 gave some hope, the next revision 2018 nearly a decade later and the plan toward 202X has proved to be so disappointing to many powers-that-be that so much of technical and scientific computing has migrated away.

As I have commented on this forum as well as others (comp.lang.fortran, GitHub j3-fortran site, Intel Forum, etc.) I am of the firm view the limitations with the language are entirely borne out of aspects missing or the baggage persisting in the standard.

3 Likes