Do you know if there is a way to currently define a parameter array of varying length strings using stdlibstring_type?
The following fails
program main
use stdlib_string_type
implicit none
type(string_type), parameter :: str_param_array(*) = [ "foo", &
"barr", &
"bazzz", &
"quuxxx", &
"corgeee" ]
print *, str_param_array
end program main
For example gfortran says
app/main.f90:5:64:
5 | type(string_type), parameter :: str_param_array(*) = [ "foo", &
| 1
Error: Different CHARACTER lengths (3/4) in array constructor at (1)
compilation terminated due to -fmax-errors=1.
<ERROR> Compilation failed for object " app_main.f90.o "
<ERROR>stopping due to failed compilation
STOP 1
On the other hand, with a ânormalâ allocatable array
program main
use stdlib_string_type
implicit none
type(string_type), allocatable :: str_param_array(:)
str_param_array = [string_type("foo"), string_type("barr"), string_type("bazzz"), &
string_type("quuxxx"), string_type("corgeee")]
print *, str_param_array
end program main
it works
$ fpm run
foo barr bazzz quuxxx corgeee
Is there a way to set a constant string_type array with varying length items (i.e. without padding with spaces)?
This is exactly the reason Iâd like the new simple to be equivalent to constexpr. Or at least that simple functions be allowed in constant expressions. I want to be able to define parameters of derived types, even if it has allocatable or private components. If one could overload the structure constructor with a simple function and it were allowed in constant expressions, then you could.
Also related, if the iso_varying_string module had actually been adopted into the standard, that might have been a way to do it too. The var_str constructor could have been defined to be allowed in constant expressions, and that would have allowed for arrays of strings with varying lengths.
At present, there is no way to have a parameter with a list of strings of different lengths.
In theory, there might be a way to have it initialized before program start. Iâve heard itâs possible in C to execute code before main. Declare a static variable in a C file that is âinitializedâ by calling a private, bind(c) procedure in the module, that initializes your âconstantsâ. Make the variables protected, instead of public, and then nothing can modify them. Itâs hacky, and everybody who ever sees it will be completely baffled, but it should work.
Yes, thanks for the hint, I might consider it. I vaguely remember that this strategy was (maybe) also used somewhere in stdlib but the details escape me now.
This is valid code, however, the values in the array do all have length 9. All of the strings in that array are the same length, some of them are just automatically padded with âblanksâ. Iâve got an example about why this might not be the best idea.
It is not defined whether this program will print T or F for any of the 3 expressions. Specifically, the blank character is only specified to be space for characters of default kind (i.e. the âfooâ literals on the last 2 lines). The comparison operators say that the shorter character is padded with blanks to the length of the longer character, but those blanks could be different characters between different kinds. It is also not defined what the order of the kinds are in the character_kinds array, so either c1 or c2 or neither might be the default kind.
Granted I have not seen a processor that will actually print F for any of those, but it could and would still be considered standards conforming.
Sadly, we canât have a constructor for the string_type that can be used to set a parameter value.
One issue is that we are providing a constructor for the string_type to overwrite the default one. This is mainly to keep the raw value safe from user access and to help handle some corner cases where unallocated deferred length character variables are passed to the default constructor.
We could make the default constructor accessible by making the raw value public and require users to promise to not touch it under any circumstances, but this wonât allow the usage of the default constructor in a parameter initialization, because the component we want to initialize is allocatable:
/home/awvwgk/projects/src/git/stdlib/_build/src/stdlib_string_type.f90:44:52:
44 | type(string_type), parameter :: a = string_type("test")
| 1
Error: Invalid initialization expression for ALLOCATABLE component ârawâ in structure constructor at (1)
Using the user-defined assignment to initialize a string_type from a fixed length character value is also not possible because the procedure is user-defined.
Surprisingly, implementing the string_type already pushed quite hard on the limits of what is currently possible in Fortran. For example, we donât have an assignment(=) from string_type to character variables because this canât be implemented for both fixed-length and deferred-length characters in a consistent way.
Just to make it crystal clear: I am very grateful for the excellent work on string_type in stdlib.
My frustration came from the realisation that the current limitations of the language are more problematic than I imagined. I would dare to say (although I cannot be totally sure) that any major language in use today does not limit the user to this extent for something as trivial as setting a constant list of strings with elements of different length.
The most important lesson I learned from this is that user-defined assignment(=) and automatic reallocation of the LHS donât work well together:
the assignment(=) is a subroutine and therefore requires explicit declaration of the the LHS
adding the allocatable attribute for the LHS limits the assignment(=) to LHS variables with allocatable attribute
leaving out the allocatable attribute allows only non-allocatable or already allocated LHS variables in the assignment(=)
this scenario is most relevant for deferred-length characters and deferred-length parametrized derived types
I wonder what would be the required change to allow users to define an assignment(=) operator that works seamlessly together with automatic LHS reallocation.
There are a couple of other things that bring interesting complications with implementing a fundamental data type like a string in Fortran. For example the user defined derived type input and output (UDDTIO) really made a lot of trouble and is still in a half-finished state for string_type.
All this pain with string_type stems from the fact that Fortran has not been designed to be able to implement user types that behave like intrinsic types, or at least work consistently with other language features such as =.
If we expose the internal string, wouldnât that fix the problem? I thought you can have an array of different lengths strings as long as you keep the (allocatable) string as a component of a derived type and I thought that is precisely what string_type is for⌠So I am confused.
Making the internal representation public creates some interesting aliasing issues for at least the assignment(=) operator:
type(string_type) :: str
str = str%raw
Exposing the raw representation allows to hack around some issues, but it also introduces new ones. In particular, we lose the memory safety introduced by string_type as the raw value can always be unallocated.
Also, for the specific use case of using the default constructor, the allocatable attribute is the actual problem, which canât be mended by making the raw representation public.
If it is without a constructor, would it then allow to create an array of strings? I still donât understand what feature that we implemented blocks this use case. Yes, it might be that we will have to choose which nice features we can keep and which we cannot.
Still doesnât work for parameter initialization as I explained here:
Itâs a limitation of the Standard not out implementation which disallows using it to initialize parameters. So far there seems to be no hack which we could use to work around the Standard limitations here.
I believe that there are so many custom-made Fortran implementations of string arrays of varying length that at least might be possible to identify and pick the best design choices or, Iâd better say, the less worse considering the limitations of the language.
I should mention that I proposed a string_class for stdlib as well some time ago:
This implementation did allow libraries like stringiFor or ftlString to inherit from the same base class and subsequently make the seamlessly intercompatible with each other and stdlibâs string_type. But we dismissed the idea in favor of the string_type at this time. My repository with the implementation should be still around.