Containers using F202Y's generic programming

I was intrigued by the topic, so I checked.

It seems that lists – at least as understood in CPython – are actually contiguous arrays of references, which makes them somewhat analogous to std::vector.

Python list vs C++ vector

What is a list to you?

In Python, as @HugoMVale wrote, the list is indeed just like std::vector, with a capacity and a size. The main difference is that in Python the type of each item is a pointer to PyObject, which effectively allows you to put any type in a list, while in C++ (and, I am suggesting, in Fortran) you typically use a more concrete type, like int, though you could also use an abstract class.
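This capacity-versus-size behavior can be observed directly from Python: `sys.getsizeof` on a list reflects the allocated pointer array, which grows in chunks (capacity) rather than on every append (size). A small, CPython-specific sketch (the exact growth pattern is an implementation detail):

```python
import sys

# Observe CPython's list over-allocation: the list stores a contiguous
# array of pointers whose capacity grows in chunks, so sys.getsizeof()
# stays flat between reallocations, just like std::vector's capacity
# versus its size.
sizes = []
items = []
for _ in range(64):
    items.append(None)
    sizes.append(sys.getsizeof(items))

# Far fewer distinct allocation sizes than appends: the capacity
# exceeds the size most of the time.
print(len(set(sizes)), "distinct sizes over", len(sizes), "appends")
```

On CPython this prints roughly a dozen distinct sizes for 64 appends, which is the over-allocation at work.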

1 Like

This is a good question. I think different people here have different ideas about the features of a list. Or an array.

To me the main difference between an array and a list is that the elements of an array can be easily referenced with an array index. In fortran, that typically means that the elements are a fixed distance apart in memory, so if you know the memory location of one element and the spacing between elements you can determine the location of any other element with just a multiplication and an addition. That also implies other things about the features of an array, namely that it has a fixed size in memory that once established cannot be exceeded. So if an array is overallocated, it is alright to work with subsets of those memory locations (like the stride notation in fortran allows), but it is then difficult to remove elements from the middle of an array, insert new elements into the middle of an array, or to add new elements that would extend beyond the original allocation.
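The address arithmetic described above (one multiplication and one addition per element access) can be sketched in a few lines; the function names and addresses here are purely illustrative, not any real API:

```python
# Sketch of fixed-stride array addressing: with a known base address
# and element spacing, locating element i is one multiply and one add.
def element_address(base, stride_bytes, index):
    return base + index * stride_bytes

# Fortran-style strided sections (e.g. a(1:9:2)) just scale the stride;
# the elements are still a fixed distance apart in memory.
def section_address(base, stride_bytes, step, index):
    return base + index * step * stride_bytes

base = 0x1000   # hypothetical base address
stride = 8      # e.g. 8-byte double precision elements
print(hex(element_address(base, stride, 5)))   # element 5
print(hex(section_address(base, stride, 2, 3)))  # element 3 of a stride-2 section
```

This is also why an array cannot cheaply absorb insertions in the middle: the formula only works if the spacing is uniform, so every element past the insertion point must be shifted.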

A list on the other hand is not (necessarily) contiguous in memory, so its elements cannot be simply referenced with an index. Depending on how the list is stored, elements of the list are usually accessed by scanning through the elements, starting at some known element and then proceeding until the nth element is located (or until some value is located, etc.). That scan operation might be one-by-one, as in a simple linked list, or it might be in variable size steps, as with a binary search tree or heap or other such data structure. Existing elements of a list can be deleted, and new elements can be inserted anywhere (beginning, middle, end). The size of the total list can grow and shrink as necessary, and there are often balancing and garbage collection operations that must be performed in order for the list to be accessed efficiently.
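As a minimal illustration of the scanning access pattern, here is a toy singly linked list (a sketch, not a library implementation): indexed access must walk node by node, which is O(n), while insertion after a known node is O(1) pointer surgery with no shifting of elements:

```python
# Toy singly linked list: access by scanning, O(1) insertion after a node.
class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

class LinkedList:
    def __init__(self, values=()):
        self.head = None
        for v in reversed(values):       # prepend in reverse to keep order
            self.head = Node(v, self.head)

    def nth(self, n):
        # O(n): scan from the head until the nth node is reached.
        node = self.head
        for _ in range(n):
            node = node.next
        return node

    def insert_after(self, node, value):
        # O(1): only two pointer updates, no elements are moved.
        node.next = Node(value, node.next)

    def to_list(self):
        out, node = [], self.head
        while node:
            out.append(node.value)
            node = node.next
        return out

ll = LinkedList([1, 2, 4])
ll.insert_after(ll.nth(1), 3)   # insert into the middle
print(ll.to_list())
```

The same structure also shows the cost side of the trade: the nodes live wherever the allocator put them, so consecutive traversal has none of the cache friendliness of a contiguous array.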

All of these data structures can now be implemented in fortran in a straightforward way. That was not exactly true with f77 and earlier versions of the language, where arrays of elementary types were the only supported data structures. In f77, lists could be simulated by using underlying arrays, but some features could only be supported with difficulty (say heaps of arbitrary size). Programmers back then were clever however, and they could do some surprising things. A heapsort for example could be implemented in f77 as efficiently as any other language within a fixed-size array. Other data structures, such as arbitrary-sized linked lists had to wait for f90 and later. This is one of several reasons why fortran was the dominant language in the 1960s and 1970s in science and engineering, but lost its reigning position through the decade of the 1980s.

I also agree with the answer above. A vector (or better, a dynamically resizable array) is to me something where the elements are contiguous in memory, while in a list they are not. The former has all the advantages of an array, with the disadvantage that inserting elements into the middle (or extending it once the preallocated buffer is exhausted) can be expensive. The latter has the advantage of being easily extensible, but is not very cache-efficient when elements are accessed consecutively. As mentioned above, Python implements the list type as a contiguous array of pointers to the individual elements, so the list elements themselves are not contiguous in memory.

OK. I thought that Python lists were rather similar to the C++ std::list, which is supposed to be a linked list (with O(1) complexity for insertion, etc…).

A list can be infinite (recursive), and yet be represented with finite resources.

This, and the subsequent, differing answers are why these data structures have not been standardized. This has in fact been the prevailing wisdom in language design for decades: don’t “bake in” data structures and algorithms into the language, because different use cases will need different implementations. Rather, make it possible for programmers to define their own implementations of data structures and algorithms.

The committee has traditionally, and even recently, taken the position that when there are multiple options for a language feature that are not obviously better or worse, we try to sidestep the debate. A recent example was the question of how prefix_sum should work with a mask. The committee decided that, rather than continue to argue about it, we’d just not have a mask argument.

For common data structures like dictionaries, lists, sets, etc., I’d expect a de facto standard implementation to emerge, but I think that, just as library implementations preceded the standard definitions in C++, we should allow that to happen in Fortran as well. That way the committee doesn’t have to spend time arguing about what the correct definition should be; we can just standardize existing practice once it’s established.

3 Likes

I find this statement extremely amusing.

In the case of the different generics proposals, the committee is aware that the traits-based proposal, which I linked to earlier in this thread, is actually far superior to both the committee’s own “templates” and the proposal for generic procedures that was submitted by the Japanese group.

The traits-based design supports procedural, functional, and object-oriented generic programming holistically, uniformly, and transparently. It does all of this in a general, clear, concise, and very user-friendly manner. It will thus make Fortran a truly modern language. None of the features that the committee has come up with itself can achieve any of this, because (as is documented on this site) they are confusing, overly verbose, and not interoperable with the language’s object-oriented features. In short, they are terribly designed.

Yet the committee continues to push these unsound, half-baked features into the standard, features that are too little, too late, and obsolescent before they’ve even been published. It is more than obvious that the committee is applying a double standard here.

I expect the committee to walk the talk, and not simply make empty statements like the one I quoted above. Refrain from bloating the language by pushing those half-baked “templates” into the standard. Wait until the traits-based generics have been prototyped in LFortran. Let the users decide which approach they prefer, and then standardize existing practice!

Anything else I would consider a disservice to the Fortran community.

6 Likes

Actually, I would argue Fortran has done exactly the opposite: it baked data structures and algorithms into the language, starting from multidimensional arrays, complex numbers, many array algorithms (matmul!), special functions, etc.

It is true that different use cases need different implementations, and indeed Fortran chose to bake the above in, and leave it to the compilers to provide maximum performance.

So as a user it is very easy to use what the language provides, and in my opinion that is what made Fortran successful: it has everything you need for scientific numerical computing baked in, at least historically.

See my comment above about “where to stop”: Containers using F202Y's generic programming - #57 by certik.

Regarding lists, the one I am proposing to bake into the language is the std::vector kind, like LFortran did (or like Python’s list, which, as mentioned above, is essentially a std::vector of pointers; the pointers are just how Python works), not the linked list, which you can build yourself, and which I would argue is much less used in practice.

You can argue that maybe it’s not needed for numerical computing. I personally think it would be useful, but there is a cost of every feature, so it’s not black and white.

However, I agree with the spirit of your post, that we want to have an implementation first in compilers before standardizing, and actually get the features used by users. Absolutely.

As @kkifonidis also said, I recommend you follow the spirit of your post (which I agree with!) for generics as well. :slight_smile:

3 Likes

I agree with most of your points, but this one is a little off target. Taking matmul() as one example, I do not know of any fortran compiler that achieves maximal performance for this operation with its native implementation. It provides a quick way to express the operation when writing code, but if you want maximal performance, you need some other library (Apple Accelerate, Intel MKL, OpenBLAS, etc.). Further, I would say that a function interface to this operation cannot provide maximal performance; you need a subroutine interface for that, so the limited performance is baked in by design.

Actually, I would argue Fortran has done exactly the opposite: it baked data structures and algorithms into the language, starting from multidimensional arrays, complex numbers, many array algorithms (matmul!), special functions, etc.

This might be a consequence of the missing standard library.

In my opinion, algorithms should be placed in libraries, with the fundamental ones in a library defined as part of the standard. An example is the set of Fortran 2023 trigonometric functions that take arguments in degrees or half-revolutions. I absolutely don’t see the need to have them as part of the language definition, and I still dream of a standard library and namespaces.

Language features should be reserved for things that are difficult to implement otherwise. Fortran’s static and strong typing has so far made user-defined implementations of lists and dicts very difficult, and I fear that the convenience of Python will never be reached. Hopefully, the new generic programming features will enable dicts and lists with a reasonable user experience; otherwise, they should become part of the language. A list that supports heterogeneous elements would even be a perfect substitute for strings of varying length.

The miracle of Fortran is that although it started as one program, the Automatic Coding System for the IBM 704, with everything baked in, it evolved into a different beast altogether. There is not a single algorithm or data structure that I can think of that is mandated by the language definition.

I’m still trying to understand why anyone on the Standards Committee thought that

A(@[3,5]:[9,10])

is a better, more readable syntax than what was already in place in Fortran, viz.

A(3:9, 5:10)

I’ve always looked at things like this as “pet projects” of some influential member of the Committee that might meet a specific need for one of their paying customers but add little to no value to the language as a whole. I think the new trig function capabilities also fall into the “pet project” classification. We all have our own prejudices about what should be in the language and what shouldn’t, but I continue to see things added that, based on my experience with Fortran (over 50 years now), have only a limited or narrow use to the larger universe of Fortran programmers. But suggest even a very limited change to an existing feature that logic and experience show would improve the language for all users, and some committee members will come up with a million excuses as to why it shouldn’t be done. We all know the usual suspects (backwards compatibility, breaking existing code, etc.).

I agree that most algorithms belong in a library, and that’s not really the issue for me. The issue is how a user accesses that library. It should be seamless and feel like an integral part of the language even if it’s not. Like others, I think that at least a std::vector type, or something that provides the same functionality, should be part of the language.

One of the things on my personal “want” list is the MATLAB set utilities. I see these as a natural extension of Fortran’s existing array utilities like findloc, cshift, etc. No new data type would need to be created, just efficient algorithms for “masking” off unwanted elements and merging array components.
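A sketch of how such set utilities could look when built from plain masking and merging, here in Python for illustration: the function names mirror MATLAB’s, but these are simplified versions that preserve order and duplicates, unlike MATLAB’s sorted, deduplicated results.

```python
# Set utilities built from simple masking over arrays (illustrative,
# not any standard library; simplified relative to MATLAB's versions).
def intersect(a, b):
    seen = set(b)
    return [x for x in a if x in seen]      # mask: keep shared elements

def setdiff(a, b):
    seen = set(b)
    return [x for x in a if x not in seen]  # mask: drop shared elements

def union(a, b):
    out, seen = [], set()
    for x in list(a) + list(b):             # merge, skipping repeats
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

print(intersect([3, 1, 4, 1, 5], [1, 5, 9]))
print(setdiff([3, 1, 4, 1, 5], [1, 5, 9]))
print(union([3, 1, 4], [1, 5, 9]))
```

The point of the sketch is that no new data type is needed; each utility is just a filtered or merged view of ordinary arrays.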

Not really the main topic of this thread, but I have to defend the committee in this matter. This is not about readability, but about flexibility. Whenever you can get away with the old syntax, you definitely should use it, as it is much more readable, as you point out.

However, if you want to write generic algorithms that work for arrays of different ranks, you might need the more flexible syntax to deal with the actual ranks of the actual arguments. Think of a generalized matrix product \sum_k A_{i \ldots j k} \, B_{k l \ldots m} = C_{i \ldots j l \ldots m}, where you want to multiply over one index only and would like it to work for arrays A and B of arbitrary rank. With the new syntax, the indexing arrays can be allocated dynamically according to the actual ranks of A and B. With the old syntax, you would have to write a specific subroutine for every combination of the ranks of A and B (thinking of arrays with up to 15 dimensions, that would amount to 15^2 routines). An example of such a routine is NumPy’s/PyTorch’s einsum().
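For illustration, here is a toy rank-agnostic contraction over one shared index, written in Python over nested lists (a sketch of the idea, not of the proposed Fortran syntax). The output index tuples are built dynamically from the actual ranks of the operands, which is exactly what lets a single routine replace one hand-written subroutine per rank pair.

```python
from itertools import product

# Contract the last axis of a with the first axis of b, for nested-list
# "tensors" of any rank: C[i..j, l..m] = sum_k A[i..j, k] * B[k, l..m].
def shape(t):
    s = []
    while isinstance(t, list):
        s.append(len(t))
        t = t[0]
    return tuple(s)

def get(t, idx):                       # index a nested list by a tuple
    for i in idx:
        t = t[i]
    return t

def contract(a, b):
    sa, sb = shape(a), shape(b)
    assert sa[-1] == sb[0]             # the shared index k must match
    out_shape = sa[:-1] + sb[1:]
    # Loop over every output index tuple, whatever the ranks are.
    flat = {}
    for idx in product(*(range(n) for n in out_shape)):
        ia, ib = idx[:len(sa) - 1], idx[len(sa) - 1:]
        flat[idx] = sum(get(a, ia + (k,)) * get(b, (k,) + ib)
                        for k in range(sa[-1]))
    def build(prefix, dims):           # rebuild the nested-list result
        if not dims:
            return flat[prefix]
        return [build(prefix + (i,), dims[1:]) for i in range(dims[0])]
    return build((), out_shape)

# Rank-2 x rank-2 reduces to the ordinary matrix product:
print(contract([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The same function handles matrix-vector and vector-vector cases with no extra code, which is the flexibility argument in miniature.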

1 Like

O.K., I see now, but couldn’t this have been done with either the existing RANK statement or an extension of RANK, rather than resorting to a (to my eyes) very strange and “unfortranic” syntax? Also, just how many users in the Fortran universe actually need this capability? The only people that immediately come to mind are people writing linear algebra libraries. The largest number of ranks I’ve ever seen used in the areas I’ve worked in (FEM and CFD) is 9, and that is to hold a constitutive tensor (which is usually collapsed, using symmetry, into smaller-ranked arrays with something like Voigt notation). I think for most application areas the ranks of the required matrices/arrays are known a priori.

1 Like

I don’t use those kinds of functions often, but I’m surprised at the animosity displayed toward them in this thread. Adding them doesn’t hurt anyone, there aren’t any backward compatibility issues involved, and those names don’t really have a better use for some potential future purpose. Also, if you have ever written one of those routines yourself, you know it is not a trivial exercise to get all of the argument reduction right and to get the right results down to the last bit. I was glad to see them added even if I seldom need them.

There are a lot of things proposed by users for addition to the Standard that would meet this criterion. However, because they were proposed by Joe User and not a member of the Committee, they are never considered, or are just dismissed outright.

That’s the point I am making. Fortran bakes all the useful features into the language and leaves it to the compilers to optimize, without mandating how it has to be done, leaving lots of freedom.

There are actually some that are mandated; for example, I believe arrays of complex numbers have to be interleaved (real, imag, real, imag, …). I think something breaks if a compiler wanted to implement them as separate arrays (real, real, …; imag, imag, …), even though some operations might be faster on the separate representation.

First of all, many compilers allow you to dispatch matmul to internally call MKL/OpenBLAS/Accelerate, etc., so you can indeed get maximum performance. Regarding functions returning arrays being slower than subroutines, are you referring to the temporary copy that some compilers make? My understanding is that you don’t need to create a temporary; for example, LFortran doesn’t, since we convert all functions that return arrays into subroutines. I agree that compilers often fall short of giving maximum performance for matmul, but I consider that the fault of the compilers, not Fortran. It is one of the reasons I decided to do something about it; I think Fortran has a lot more potential that has not yet been realized.

3 Likes

I guess that the new unfortranic syntax also opens up the possibility of easily setting the bounds of an array according to the bounds of another array:

real :: b(@lbound(a):ubound(a))
! instead of (for a rank 3 array)
real :: b(lbound(a,1):ubound(a,1),lbound(a,2):ubound(a,2),lbound(a,3):ubound(a,3))

At least the half-revolution versions are recommended operations in the IEEE 754 standard (page 60).

2 Likes

Yes, I use that feature often with gfortran. My comment was specifically about the “native implementation” of matmul(). When you specify the option to replace that native implementation with an external library, that demonstrates my point; it doesn’t contradict it.

Regarding the temporary copy, that includes the temporary copy that sometimes occurs for the arguments of matmul() (e.g. how transpose(A) or conjg(A) might work), and also for the array result (which is typically on the stack or sometimes the heap). Of course an optimizing compiler can do magical things with its intrinsic operators, including inlining code, but in practice those things just aren’t done now, and they have had some 35 years to figure out how to do them, so if it hasn’t happened by now it probably never will. So for the past, present, and foreseeable future, if a programmer wants maximal performance for a matrix-matrix product, it is best if he calls an external procedure (such as the BLAS sgemm or dgemm) rather than relying on matmul(). This is especially true when the result is being accumulated from a sequence of products, where each product invokes the memory allocation step. Yes, an optimizing compiler might do some magical things to do a single allocation to achieve the same performance, but with the external library, the programmer has control over the operation, and he can do that single allocation if appropriate, he need not rely on compiler magic.

Having said all that, there is also the fact that matmul() has more flexibility in its arguments than sgemm/dgemm. In the BLAS calls one of the array strides of each argument array must be 1, whereas matmul() allows arbitrary strides of each of its two arguments. So there is room for improvement here, where the programmer is given even more flexibility than the current situation. The ideal situation would be a subroutine interface (which allows the programmer more control over the temporary array allocations), where transpose and conjugate operations can be specified without making temporary copies, where the result array can be incremented (alpha and beta scale factors), where all array arguments are assumed shape (no imposed unit strides), and where the matrix-matrix, matrix-vector, vector-matrix, and vector-vector special cases are identified at compile time, not with run time checking. There are several choices in how each of those could be done. Maybe all of that is too much to ask?
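Much of that wished-for interface can be captured in a toy model, here in pure Python for illustration only (none of BLAS’s blocking or vectorization, and the signature is a sketch of the idea, not any real library’s API): transposition is handled by index swapping rather than temporary copies, and alpha/beta scale factors let the result be accumulated in place with a single caller-controlled allocation.

```python
# Toy BLAS-style update: C := alpha * op(A) @ op(B) + beta * C,
# where op() is optional transposition done by swapping indices
# instead of materializing a transposed temporary. Matrices are
# lists of lists; C is updated in place, as a subroutine interface
# would allow.
def gemm(alpha, a, b, beta, c, trans_a=False, trans_b=False):
    m, n = len(c), len(c[0])
    k = len(a) if trans_a else len(a[0])   # shared inner dimension
    for i in range(m):
        for j in range(n):
            s = 0
            for p in range(k):
                ai = a[p][i] if trans_a else a[i][p]   # op(A)[i, p]
                bj = b[j][p] if trans_b else b[p][j]   # op(B)[p, j]
                s += ai * bj
            c[i][j] = alpha * s + beta * c[i][j]
    return c

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[0, 0], [0, 0]]
print(gemm(1, A, B, 0, C))                  # plain product
print(gemm(1, A, B, 0, [[0, 0], [0, 0]], trans_a=True))  # uses A^T, no copy
```

Because C is passed in and updated, a sequence of accumulated products reuses one buffer instead of allocating a fresh result per call, which is the control-over-allocation point made above.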