`stdlib` `string_type` `parameter` array currently possible?

Do you know if there is a way to currently define a parameter array of varying length strings using stdlib string_type?

The following fails

program main
  use stdlib_string_type
  implicit none

  type(string_type), parameter  :: str_param_array(*) = [ "foo", &
                                                          "barr", &
                                                          "bazzz", &
                                                          "quuxxx", &
                                                          "corgeee" ]

  print *, str_param_array
end program main

For example gfortran says

app/main.f90:5:64:

    5 |   type(string_type), parameter  :: str_param_array(*) = [ "foo", &
      |                                                                1
Error: Different CHARACTER lengths (3/4) in array constructor at (1)
compilation terminated due to -fmax-errors=1.
<ERROR> Compilation failed for object " app_main.f90.o "
<ERROR>stopping due to failed compilation
STOP 1

On the other hand, with a “normal” allocatable array

program main
  use stdlib_string_type
  implicit none

  type(string_type), allocatable :: str_param_array(:)

  str_param_array = [string_type("foo"), string_type("barr"), string_type("bazzz"), &
                      string_type("quuxxx"), string_type("corgeee")]
  print *, str_param_array
end program main

it works

$ fpm run 
 foo barr bazzz quuxxx corgeee

Is there a way to set a constant string_type array with varying length items (i.e. without padding with spaces)?

Thanks.

This is exactly the reason I’d like the new simple to be equivalent to constexpr. Or at least that simple functions be allowed in constant expressions. I want to be able to define parameters of derived types, even if it has allocatable or private components. If one could overload the structure constructor with a simple function and it were allowed in constant expressions, then you could.

Also related, if the iso_varying_string module had actually been adopted into the standard, that might have been a way to do it too. The var_str constructor could have been defined to be allowed in constant expressions, and that would have allowed for arrays of strings with varying lengths.

At present, there is no way to have a parameter with a list of strings of different lengths.

3 Likes

:frowning_face: thank you

“F2008 Standard does not allow CHARACTER(LEN=*) in array constructor”

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77506

2 Likes

There is the possibility of an approximation.

associate(my_string_list_parameter => [string_type("foo"), string_type("bar"), ...])
  ! my_string_list_parameter is a constant in here
end associate
2 Likes

Good suggestion, but unfortunately I need that array as a public, global entity of a module used in several places.

In theory, there might be a way to have it initialized before program start. I’ve heard it’s possible in C to execute code before main. Declare a static variable in a C file that is “initialized” by calling a private, bind(c) procedure in the module, that initializes your “constants”. Make the variables protected, instead of public, and then nothing can modify them. It’s hacky, and everybody who ever sees it will be completely baffled, but it should work.

1 Like

Yes, thanks for the hint, I might consider it. I vaguely remember that this strategy was (maybe) also used somewhere in stdlib but the details escape me now.

I think Fortran 2018 does, and both gfortran -std=f2018 and ifort -stand:f18 compile a run a code such as

    character(len=9), parameter :: dayname(7) = [character(len=9) :: &
      'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', &
      'Saturday', 'Sunday' ]
print*,dayname
end

for versions

GNU Fortran (GCC) 12.0.0 20210718
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.1 Build 20201112_000000

This is valid code, however, the values in the array do all have length 9. All of the strings in that array are the same length, some of them are just automatically padded with “blanks”. I’ve got an example about why this might not be the best idea.

program main
  use iso_fortran_env, only: character_kinds
  implicit none

  integer, parameter :: c1 = character_kinds(1), c2 = character_kinds(2)
  character(len=5, kind = c1), parameter :: foo1 = "foo"
  character(len=5, kind = c2), parameter :: foo2 = "foo"

  print *, foo1 == foo2
  print *, foo1 == "foo"
  print *, foo2 == "foo"
end program

It is not defined whether this program will print T or F for any of the 3 expressions. Specifically, the blank character is only specified to be space for characters of default kind (i.e. the “foo” literals on the last 2 lines). The comparison operators say that the shorter character is padded with blanks to the length of the longer character, but those blanks could be different characters between different kinds. It is also not defined what the order of the kinds are in the character_kinds array, so either c1 or c2 or neither might be the default kind.

Granted I have not seen a processor that will actually print F for any of those, but it could and would still be considered standards conforming.

I am almost in disbelief that even something as simple as the following is not allowed

program main
  use stdlib_string_type
  implicit none

  type(string_type) :: foo = string_type("foo")
  type(string_type) :: bar = string_type("barr")
  type(string_type) :: baz = string_type("bazzz")
  type(string_type) :: quux = string_type("quuxxx")
  type(string_type) :: corge = string_type("corgeee")
  type(string_type) :: str_param_array(5) = [foo, bar, baz, quux, corge]

  print *, str_param_array
end program main

Looks like that without an executable statement (that I cannot use in the declarative section of a module) I cannot create the array. Frustrating.

1 Like

Sadly, we can’t have a constructor for the string_type that can be used to set a parameter value.

One issue is that we are providing a constructor for the string_type to overwrite the default one. This is mainly to keep the raw value safe from user access and to help handle some corner cases where unallocated deferred length character variables are passed to the default constructor.

We could make the default constructor accessible by making the raw value public and require users to promise to not touch it under any circumstances, but this won’t allow the usage of the default constructor in a parameter initialization, because the component we want to initialize is allocatable:

/home/awvwgk/projects/src/git/stdlib/_build/src/stdlib_string_type.f90:44:52:

   44 |     type(string_type), parameter :: a = string_type("test")
      |                                                    1
Error: Invalid initialization expression for ALLOCATABLE component ‘raw’ in structure constructor at (1)

Using the user-defined assignment to initialize a string_type from a fixed length character value is also not possible because the procedure is user-defined.

Surprisingly, implementing the string_type already pushed quite hard on the limits of what is currently possible in Fortran. For example, we don’t have an assignment(=) from string_type to character variables because this can’t be implemented for both fixed-length and deferred-length characters in a consistent way.

2 Likes

Just to make it crystal clear: I am very grateful for the excellent work on string_type in stdlib.

My frustration came from the realisation that the current limitations of the language are more problematic than I imagined. I would dare to say (although I cannot be totally sure) that any major language in use today does not limit the user to this extent for something as trivial as setting a constant list of strings with elements of different length.

2 Likes

Fully agree with this statement.

The most important lesson I learned from this is that user-defined assignment(=) and automatic reallocation of the LHS don’t work well together:

  • the assignment(=) is a subroutine and therefore requires explicit declaration of the the LHS
  • adding the allocatable attribute for the LHS limits the assignment(=) to LHS variables with allocatable attribute
  • leaving out the allocatable attribute allows only non-allocatable or already allocated LHS variables in the assignment(=)
  • this scenario is most relevant for deferred-length characters and deferred-length parametrized derived types

I wonder what would be the required change to allow users to define an assignment(=) operator that works seamlessly together with automatic LHS reallocation.

There are a couple of other things that bring interesting complications with implementing a fundamental data type like a string in Fortran. For example the user defined derived type input and output (UDDTIO) really made a lot of trouble and is still in a half-finished state for string_type.

3 Likes

All this pain with string_type stems from the fact that Fortran has not been designed to be able to implement user types that behave like intrinsic types, or at least work consistently with other language features such as =.

If we expose the internal string, wouldn’t that fix the problem? I thought you can have an array of different lengths strings as long as you keep the (allocatable) string as a component of a derived type and I thought that is precisely what string_type is for… So I am confused.

Making the internal representation public creates some interesting aliasing issues for at least the assignment(=) operator:

type(string_type) :: str
str = str%raw

Exposing the raw representation allows to hack around some issues, but it also introduces new ones. In particular, we lose the memory safety introduced by string_type as the raw value can always be unallocated.

Also, for the specific use case of using the default constructor, the allocatable attribute is the actual problem, which can’t be mended by making the raw representation public.

2 Likes

If it is without a constructor, would it then allow to create an array of strings? I still don’t understand what feature that we implemented blocks this use case. Yes, it might be that we will have to choose which nice features we can keep and which we cannot.

1 Like

Still doesn’t work for parameter initialization as I explained here:

It’s a limitation of the Standard not out implementation which disallows using it to initialize parameters. So far there seems to be no hack which we could use to work around the Standard limitations here.

5 Likes

Yes, you are completely correct. I suspect that this is exactly what happened when Stefano Zaghi made public the inner allocatable in StringiFor following my request and that introduced regressions like this one.

I believe that there are so many custom-made Fortran implementations of string arrays of varying length that at least might be possible to identify and pick the best design choices or, I’d better say, the less worse considering the limitations of the language.

1 Like

I should mention that I proposed a string_class for stdlib as well some time ago:

This implementation did allow libraries like stringiFor or ftlString to inherit from the same base class and subsequently make the seamlessly intercompatible with each other and stdlib’s string_type. But we dismissed the idea in favor of the string_type at this time. My repository with the implementation should be still around.

1 Like