Containers using F202Y's generic programming

With all due respect, the problem with the current state is the Fortran bus is still being driven to a large part by the “Linear Algebra folks”. They are apparently clueless to what it takes to write modern simulation codes in a lot of areas that Fortran used to dominate. Try wandering out of your own little bubble and make an effort to see just what data structures and algorithms people need today, not 40 years ago.

3 Likes

I have no idea what you are talking about. Problem here is technical, there is no way in the language to efficiently manage collections in a memory-safe way. If I am wrong, prove me.

Now that we finally all agree that basic data structures are a neccesity (and I don’t think anyone sane will disagree), the discussion only is whether they should be part of the language (like Python), or part of library, requiring very large extension of the language (like C++). I do not believe the latter will work, because these days even simple OOP is so buggy it is unusable when you try to test it on 10+ compiler versions and linux distros – and this is the minimum level of support required for any feature to be adopted.

Both.

Like in Python which has both in the language, as well as part of a library.

And like in C++ that also has both: even though you can do arrays and parallel loops as part of a library, they are still standardizing it to become part of the language, and for basic containers even though technically they are implemented in the C++ standard library, since they are “standard”, i.e., part of the language, compilers can implement special handling of them, in fact our prototype C/C++ compiler (GitHub - lcompilers/lc: C++ compiler) does exactly that, it’s not compiling std::vector as a library, but rather understands the syntax and uses the same internal “list” implementation that LFortran is using.

So I think they should eventually be standardized (after compilers implement prototypes and users like how it works), the only remaining question is the syntax, should we extend Fortran syntax for the basic containers, or should we use the (to be done) generic features. But internally in the compiler you want to have first-class understanding of these basic containers anyway, for performance and possibly better error messages.

Yes, I think it’s easier to implement just the containers directly like we did in LFortran.

However, as an orthogonal point: since Fortran has OO, I think the OO should be done “right” and it should be fixed, so I really like @kkifonidis’s proposal. And if done right, of course it can also be used to create custom containers, but I see that as an orthogonal feature.

1 Like

I think the operative word here is “basic containers”. These are needed to build more elaboratte “custom containers” like quadtrees/octrees and kDtrees. At a minimum, I think all you really need are:

  1. A dynamic array (list ?) ala C++ vector
  2. A doubly linked circular list
  3. hash maps/tables and dictionaries
  4. MATLAB like set functions (these don’t have to be OO. Just work on the existing array structure).
    Other containers like stacks, queues, priority queues. red/black trees etc. could be deferred to a STL like library because they would probably be built using the basic types above. I also don’t need these to support every possible combination of intrinsic types. Just a couple of integer and real types would be sufficient for a first step. If you initially limit the scope in terms of what data types are supported (and expand it later), I don’t think that having the four containers etc. I list above as an integral part of the language would be a major development task. How you implement these things has been known for decades. The only thing lacking is a will to do it.
2 Likes

Starting with 4), what would be the functionality missing from GitHub - urbanjost/M_sets: basic set functions that would be required to fullfill that need? My need for that functionality was very specific to a particular project so I have not done much with it since that was completed, but if it would not be that much of a change to make it generically useful perhaps it could find a home in stdlib and be further developed (or perhaps there are other similar seeds available I am not aware of).

@urbanjost, I’ll have to take a look at your code, but I’ve implemented clones of the MATLAB set functions in a way they are for the most part drop in replacements for the equivalent MATLAB functions. I limit support for now to all the intrinsic integer types defined in ISO_FORTRAN_ENV as well as REAL32 and REAL64. Basically, once you have a UNIQUE function, you can use that to build the rest of the MATLAB functions.
Here are my comments describing the functions in my code.

! Utility routines that mimic some of the MATLAB set utiliies.

! The following routines are supported for rank 1 arrays of type INT8, INT16
! INT32, INT64, REAL32, REAL64

! unique    - finds the unique values in an array and returns in sorted order
! union     - merges two arrays and returns unique values in sorted order
! intersect - finds the intersection of two arrays (common values) and returns
!             in sorted order
! setdiff   - finds values in set A not in set B with no repetitions and 
!             returns in sorted order
! ismember  - Returns logical array showing which values in A are in B.
! setxor    - Finds values in A and B not in intersection of A and B and 
!             returns in sorted order  

! Most of the functions supporting floating point values allow an optional
! user_tolerance value to be input that is used in comparison tests. A default
! value of 1.0E-6 is used for REAL32 arrays and a default value of 1.0E-12 is
! used for REAL64 arrays if the optional user_tol argument is not present. Note
! this allows the same functionality as the MATLAB ismembertol and uniquetol
! functions without creating explicit versions of those functions

My use case for these is building connectivity arrays for an unstructured grid CFD code that supports several mixed cell topologies.

2 Likes

Looks like parallel evolution; where I had a need for static order to be optionally available; and I did not provide some warranted options regarding comparing floats; … but sounds like a lot of similarities regarding the interface/API and overall functionality.

Gee, it will be so incredibly kind to the poor, persevering, practitioners of Fortran if those in charge of the language - such as those now who are able to be on the standards committee, here note only a few, mostly in the US are fortunate relative to silent masses globally - would simply come off their different, different high horses, get into a just do it mode, and add

string data type, another basic data structure!

Note one can justifiably argue the CHARACTER data type in Fortran introduced starting ANSI FORTRAN 77 is a basic data structure with the KIND and LENGTH-type parameters.

Now a string type would do wonders for the language.

For a Fortran author to be able to write an array of strings like so will be such a blessing:

..
string, parameter :: s(*) = [ "Mary", "had", "a", "little", "lamb" ]
..
string, allocatable :: pangram(:)
..
print %, s(2)%len
if ( s(1)(1:1) == "M" ) then
..
..

pangram = [ "The", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "cat" ]
..
pangram(9) = "dog"

The above is so desired as one can notice with so many attempts at “roll your own” string “classes” in user libraries.

But all of them fall so short of what is possible in the language itself.

Heck, the CHARACTER data type - a basic data structure - itself won’t be possible in practitioner authored libraries without the so-called “compiler magic” in Fortran.

A simple rule shall be if "compiler magic" is needed, then it shall be part of the base language. Here, examples also include MAX, MIN, FINDLOC etc. intrinsics - .

Sigh! No good advice goes unrejected is the mantra in the world of Fortran language evolution.

2 Likes