Fortran 2023 standard

Thanks. Yes, that is indeed a very limited set of cases.

Fortranners can check back 20 years later: unless the Community has really rallied behind initiatives such as LFortran, endeavored to implement facilities in the Fortran language entirely independently of WG5 and J3, and relegated the ISO / IEC standard development to merely an afterthought that simply and “officially” documents something already standardized in actual practice due to vast community adoption (as it is with almost all tools of the trade and associated methods and procedures in most industries), it will be the same exact situation as today.

Fortran 204Y may be up for discussion and 6 to 9 “wise guys”, the most influential on WG5 and J3 of whom do not engage directly with the community and stay in glass houses instead, will deem basic facilities such as STRING and BITS types unnecessary, even though all other modern languages, coming up from scratch at a faster pace than the Fortran committees can work out the font aspects in the PDF document, offer them right from the get-go as basic features their practitioners must have.

But not Fortran!! Keep on writing your own derived type variants around character(len=:), allocatable :: chars and logical(LOGICAL_NN) :: bitdat and what-not and “proudly” insert implicit none everywhere!!

Seriously, what nonsense is the language development (or the lack thereof) up to?!

@certik wrote above: “Why don’t you come up with some good Fortran syntax for this? We can easily create a prototype for this, since we already did the hard work of making it working in the middle end and backends.”

This seems like a good way to make progress independently from J3. A complete implementation with all of the details worked out should be much easier to standardise. And even if it doesn’t go into the standard, it could become a commonly-implemented extension. There is no shortage of those in Fortran.

So all we need is a good specification of a new intrinsic STRING type that fits in with the rest of the language.

3 Likes

Blaming J3 is much easier.

1 Like

That’s right. I think that’s the way to do it: a well designed extension, that many people here are asking for and would be happy to use. I am happy to supervise the implementation. But I need people here to help on the syntax and examples side and help design it and ensure that our implementation is working correctly (by testing it, reporting bugs, etc.).

1 Like

I am relatively new to the Fortran standardization effort. The standard itself makes it a bit challenging to understand the consequences of potential changes.

  • It is long. 2023 FDIS is 688 pages.
  • It is dense.
  • It has a very long history, and semantics discussions (recorded in the J3 papers) can be hard to come by or to analyze.
  • There are sections where we define somewhat formally the expected processor or user code behavior, but mostly the tool of choice is English prose.
  • I am unaware of tools for assessing the impacts of changes (beyond grep on the LaTeX source, and the gray matter of the committee members).

This week I learned a bit about how the ECMAScript (JavaScript) community updates their standard about every year. The current standard is 840 pages.

  • An early goal of the standard was for it to be machine-readable, to support tooling for the standard developers and compiler implementers.
  • The standard can be translated into a formal model of expected behavior.
  • That formal model can be used as input to other tools (e.g., standard checkers, test generators, program analysis tools for JavaScript programs).
  • They have analysis tools that can evaluate program behavior against multiple versions of the standard, for example the behavior of their test suites.
  • They have integrated these modeling tools into the standard’s continuous integration process. Failure to produce a model is usually a bug in the standard.

(Note, they don’t have ISO to deal with, either.)

It is really impressive.

8 Likes

The above seems to be the gist of this whole thread. It would be great to know why it is so.

2 Likes

@certik,

As I have explained above, I will stress a few aspects of the desired facility including its semantics before diving deep into the syntax. But say for the sake of discussion the new type is named string. Then

  1. This new type shall be an intrinsic one and the standard semantics on intrinsic types shall apply to string starting with it not being an extensible type.

  2. This intrinsic type shall thus be declarable in any scope where other intrinsic types such as integer can be declared. Thus no USE statement shall be needed to import the type into a scope.

  3. string type shall be as though it has a private component of the intrinsic type character.

  4. The processor shall support at least two KINDs of this string type: one KIND as though the component mentioned in 3 above is of default character; the second KIND as though the character component is of the ISO 10646 set.

  5. A convenient means to define a variable of this type using character literal constants shall be available.

  6. A convenient means to construct an array of string type with character strings of same or different lengths shall be available, possibly like so:

string :: pets(3)
pets = [ string :: "dog", "pony", "turtle" ]
  7. The same means to access sections of character data as applicable to character intrinsic type including with arrays of character type shall be available to string e.g.,
string :: language
language = "Fortran"
print *, language(1:3) ! outputs "For"
string :: pets(3)
pets = [ string :: "dog", "pony", "turtle" ]
print *, pets(1)(1:2) ! outputs "do" 
  8. Methods to operate on the character data of string data shall also be available as though they are type-bound. The list of methods shall be identified based on feedback from the Community and reviewed and developed in a workflow similar to Fortran stdlib. However the list shall include a method named insert to introduce a string of provided character(s) at the specified position pos e.g.,
string dilemma
dilemma = "to be the question"
call dilemma%insert( pos=7, chars="or not to be is" )
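
For readers who want to experiment before any compiler support exists, items 3 and 8 can be roughly emulated in standard Fortran with a derived type. This is only a sketch under stated assumptions: the names string_sketch, string_t, get and the clamping of pos are hypothetical choices and not part of the proposal, and the inserted text carries a trailing space here so that the words do not run together.

```fortran
module string_sketch
  implicit none
  private
  public :: string_t

  ! Hypothetical stand-in for the proposed intrinsic type (not the proposal itself).
  type :: string_t
    private
    character(len=:), allocatable :: chars   ! private character component (item 3)
  contains
    procedure :: insert => string_insert     ! type-bound method (item 8)
    procedure :: get => string_get
  end type string_t

  ! Constructor overload so that string_t("dog") works despite the private component.
  interface string_t
    module procedure string_from_chars
  end interface string_t

contains

  pure function string_from_chars(chars) result(s)
    character(len=*), intent(in) :: chars
    type(string_t) :: s
    s%chars = chars
  end function string_from_chars

  ! Return the stored characters; an unset string reads as empty.
  pure function string_get(self) result(chars)
    class(string_t), intent(in) :: self
    character(len=:), allocatable :: chars
    if (allocated(self%chars)) then
      chars = self%chars
    else
      chars = ""
    end if
  end function string_get

  ! Insert chars so that its first character lands at position pos.
  ! Assumption: pos is clamped to the range 1 .. len+1.
  subroutine string_insert(self, pos, chars)
    class(string_t), intent(inout) :: self
    integer, intent(in) :: pos
    character(len=*), intent(in) :: chars
    integer :: p
    if (.not. allocated(self%chars)) self%chars = ""
    p = max(1, min(pos, len(self%chars) + 1))
    self%chars = self%chars(:p-1) // chars // self%chars(p:)
  end subroutine string_insert

end module string_sketch

program string_sketch_demo
  use string_sketch, only: string_t
  implicit none
  type(string_t) :: dilemma, pets(3)
  integer :: i

  dilemma = string_t("to be the question")
  call dilemma%insert(pos=7, chars="or not to be is ")
  print '(a)', dilemma%get()   ! to be or not to be is the question

  ! Elements of different lengths in one array, in the spirit of item 6.
  pets = [ string_t("dog"), string_t("pony"), string_t("turtle") ]
  do i = 1, size(pets)
    print '(a)', pets(i)%get()
  end do
end program string_sketch_demo
```

What a library type like this cannot give is exactly what items 1, 2, 6 and 7 ask for: declaration without a USE statement, the [ string :: ... ] constructor, and substring syntax such as pets(1)(1:2).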

I have a longer list of additional requirements based on use cases involving library solutions that have been consumed for such a type and which I can provide over time.

However, I just wanted to get the ball rolling so that you and the readers can review the above 8 items and we can see how they are received, what the comments and feedback are, and what you think of the feasibility of implementation in LFortran.

Thanks,

7 Likes

I agree 100% with all but point 8. Personally I’m not a huge fan of the type bound procedure syntax because it looks extremely similar to accessing a field of a derived type, and is thus confusing. Others may feel differently, so I’m not married to the idea, but I think leaving string intrinsics as subroutines accessed like normal would be better. I do not dispute that such routines need to exist along with the other aspects outlined for an intrinsic string type.

As an aside I have no opinion on the ISO kind character component, mostly because I have no idea what that gives me. Does it enable some Unicode values or something?
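
On the ISO 10646 question: character already has such a kind today, and it is what lets a program hold code points outside ASCII. A minimal sketch, assuming the compiler supports the ISO_10646 kind (selected_char_kind returns a negative value otherwise and the declaration below will not compile); the program and variable names are arbitrary.

```fortran
program ucs4_demo
  implicit none
  ! Kind number for ISO 10646 (UCS-4) characters; negative if the processor lacks it.
  integer, parameter :: ucs4 = selected_char_kind('ISO_10646')
  character(kind=ucs4, len=:), allocatable :: s

  ! Build a string ending in GREEK SMALL LETTER PI (code point U+03C0).
  s = ucs4_'pi = ' // char(int(z'03C0'), kind=ucs4)

  print '(i0)', len(s)                     ! 6: the non-ASCII letter is one character
  print '(i0)', ichar(s(len(s):len(s)))    ! 960, i.e. hexadecimal 3C0
end program ucs4_demo
```

A second KIND of the proposed string type would presumably wrap a component declared like s above.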

1 Like

What is the advantage of this compared to the normal character syntax:

dilemma = dilemma(1:6) // "or not to be is" // dilemma(7:)
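
For reference, that line already runs as is with a deferred-length character variable. A minimal sketch (the program name is arbitrary):

```fortran
program concat_demo
  implicit none
  character(len=:), allocatable :: dilemma

  dilemma = "to be the question"
  ! Keep the first six characters ("to be "), splice in the new text, append the rest.
  dilemma = dilemma(1:6) // "or not to be is" // dilemma(7:)

  print '(a)', dilemma        ! to be or not to be isthe question
  print '(i0)', len(dilemma)  ! 33
end program concat_demo
```

With the text exactly as written in the thread, the result has no space between "is" and "the"; adding a trailing space to the inserted text fixes that, for either syntax.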

Me neither. But beyond personal preferences, the consistency with the rest of the language matters:

  • all Fortran intrinsics are classical functions/subroutines, with few exceptions (e.g. %re and %im, but these ones can be viewed as components)
  • a string type would be a kind of extension of the character type, hence the same functions/subroutines should apply (when applicable), with the same syntax.
4 Likes

I can think of situations where, instead of having to count characters and find the position of a substring where new text is to be inserted, as in

string dilemma
dilemma = "to be the question"
call dilemma%insert( pos=7, chars="or not to be is" )

it would be more convenient to be able to write

string dilemma
dilemma = "to be the question"
call dilemma%insert( after="be ", chars="or not to be is" )
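
An after= mode can be tried out today as an ordinary function built on the index intrinsic. This is a sketch only: insert_after is a hypothetical name, returning the text unchanged when the marker is not found is one possible design choice, and the inserted text has a trailing space added so the words stay separated.

```fortran
module insert_after_sketch
  implicit none
contains

  ! Hypothetical helper: return text with chars inserted right after the
  ! first occurrence of after; text is returned unchanged if after is absent.
  pure function insert_after(text, after, chars) result(res)
    character(len=*), intent(in) :: text, after, chars
    character(len=:), allocatable :: res
    integer :: k

    k = index(text, after)      ! start of the first match, 0 if none
    if (k == 0) then
      res = text
    else
      k = k + len(after)        ! first position past the match
      res = text(:k-1) // chars // text(k:)
    end if
  end function insert_after

end module insert_after_sketch

program insert_after_demo
  use insert_after_sketch, only: insert_after
  implicit none
  character(len=:), allocatable :: dilemma

  dilemma = "to be the question"
  dilemma = insert_after(dilemma, after="be ", chars="or not to be is ")
  print '(a)', dilemma   ! to be or not to be is the question
end program insert_after_demo
```
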
2 Likes

I disagree entirely with the “more convenient” aspect.

But a generic interface to the INSERT procedure that allows for different options (modes), 1) at some specified position, 2) “after” something, etc., is worth strong consideration for the new STRING type design, which might follow a community-driven workflow similar to stdlib.

Now, however, option 1 involving something like a POS argument is my first preference with this INSERT procedure. It is based on many use cases I have reviewed across quite a few codebases.

Could this be done with a replace() function instead?

1 Like

I have edited my previous post on desired semantics to state, “Methods to operate on the character data of string data shall also be available as though they are type-bound.” Note the “also”.

There are a couple of reasons, mostly intended to serve my colleagues who tend to be polyglot, much younger, and for whom Fortran is often an nth (n > 3) programming language in terms of their learning order and further down in their preference. But for many of the apps they work on, Fortran should rise up to be their lingua franca again in the not too distant future; that is the vision anyway.

  1. they are most familiar with the notion of a class for a string type, as opposed to the “raw” char type, that has methods which operate on the data of an instance of the class. In other words, the type-bound aspect is intuitive to them.
  2. there already are library solutions in Fortran that some of them use which are similar to the string_type of Fortran stdlib and which have type-bound procedures. A desire for a similar structure with TBPs in the eventual intrinsic string type has been expressed to me by several of them.

Outside of the traditional Fortran parlance, many coders are quite savvy about OO and know well the differences between data fields and methods. I think the risk of any confusion with a string intrinsic type and TBPs is quite low among the future generation of Fortranners, who come to Fortran with a strong background in these other paradigms.

2 Likes

A new topic should be opened to discuss the specifications of a string type…

1 Like

FD Admin(s): if it’s not too much effort, as suggested by @PierU, can you please move the discussions on the specifications of the STRING type to a new thread, perhaps starting with @certik’s leading post here?

@certik,

Based on your experience and knowledge of LFortran compiler development, do the two things mentioned above - a convenient array constructor for STRING type and an accessor for substring references similar to CHARACTER type - look feasible to you technically in LFortran, or do you think it’s too challenging to implement in LFortran?

This is important to understand because this is the crux of the matter. The rest of the syntax is not too difficult: as @jeremie.vandenplas pointed out and you noted, the string_type in Fortran stdlib is a ready go-by for an LFortran prototype to consider.

1 Like

Since this is within the discussion of new Fortran features, what does everyone think of the following capability?

string, save :: pets(3) = [ string :: "dog", "pony", "turtle" ]
string :: x(3), y(3)
x =  [ string :: "do", "po", "tu" ]
y = pets(:)(1:2)  ! should be the same as y = x

This kind of array assignment does not work with arrays of allocatable components because the underlying memory is not accessed with regular strides. But with the string type, the underlying memory would never be expected to have regular strides anyway, so it seems like this would allow this functionality for this new intrinsic data type, even if it is not allowed for integer, real, logical, or character allocatable components.

3 Likes

That type of functionality could definitely come in handy. What would happen if you had instead used y = pets(:)(1:4), noting that pets(1) is only len 3?
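
One way to explore both the array form and the out-of-range question with today's compilers is an elemental function over a wrapper type. This is a sketch under assumptions: string_t and substr are hypothetical names, and clamping the bounds to the actual length is only one possible answer; an intrinsic string type could just as well treat pets(:)(1:4) as an error or pad with blanks.

```fortran
module substr_sketch
  implicit none

  ! Hypothetical stand-in for the proposed type; the component is left
  ! public only to keep the demo short.
  type :: string_t
    character(len=:), allocatable :: chars
  end type string_t

contains

  ! Elemental, so it applies to a scalar or to every element of an array.
  ! Assumption: bounds are clamped to 1 .. len instead of being an error.
  elemental function substr(s, first, last) result(res)
    type(string_t), intent(in) :: s
    integer, intent(in) :: first, last
    type(string_t) :: res
    integer :: lo, hi

    if (.not. allocated(s%chars)) then
      res%chars = ""
      return
    end if
    lo = max(first, 1)
    hi = min(last, len(s%chars))
    res%chars = s%chars(lo:hi)   ! zero length when lo > hi
  end function substr

end module substr_sketch

program substr_demo
  use substr_sketch, only: string_t, substr
  implicit none
  type(string_t) :: pets(3), y(3)
  integer :: i

  pets(1)%chars = "dog"
  pets(2)%chars = "pony"
  pets(3)%chars = "turtle"

  y = substr(pets, 1, 4)   ! plays the role of y = pets(:)(1:4)
  do i = 1, size(y)
    print '(a)', y(i)%chars   ! dog, pony, turt
  end do
end program substr_demo
```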