What are your opinions of adding @milancurcic 's Functional-Fortran library to stdlib? As a functional programming fan and given its benefits in expressing ideas concisely, I am all for it. I think it would be an excellent enhancement.
I’m a fan as well but I’m not convinced it’s a good fit, for a reason that I’d concisely express as:
What’s useful is not performant; what’s performant is not useful.
tail, as well as recursive stuff like
fold are useful for a functional style but if you use it you’ll be banished from HPC land. Something about redundant array copies and Fortran compilers not dealing well with many levels of recursion. Maybe it improved since.
Some of it is redundant with F2018 (e.g. use the more Fortrannic
[integer ::] instead of
empty). Some of it is in
Of the bunch, perhaps
map is the best candidate. It does largely what
elemental does, but without the restrictions of
elemental. Before F2018 that was useful for
recursive procedures, but now you can do that. I believe you still can’t pass
elemental procedures as procedure pointers. Maybe
map could be useful there.
That said, does anybody use this as a dependency? I don’t, but it was a fun exercise.
That is good to know. If in general, the functional implementations in Fortran are not as performant as the non-functional counterparts or implementations, that is understandable.
When declaring a zero-sized integer array named empty I found that
[integer ::] was not needed. The declaration
was enough, even when it was an actual argument of a subprogram where the dummy argument was intent(in). But I wasn’t trying to declare empty as a parameter, and I would have needed different names for empty arrays of different kinds or ranks. (I had started by trying just [ ] because in mathematics and logic there is only one empty set, but I needed reminding that Fortran arrays are not sets.)
I was interested! But after inspecting a bit I got scared by performance implications. I saw then the github issue about time complexity and the discussion on HN and decided to roll my own, loop-based, implementation of
Some more info about what could have been my use case, if it is interesting to you:
This morning I stumbled upon the problem of building a set union of many small arrays, with (a lot) of values in common (in the worst case scenario the size of the union would be orders of magnitude smaller than the size of a simple concatenation). So I started a google hunt for an efficient implementation of such a thing. When I stumbled on your repo my first reaction was “yes, a clean, well documented, full-fledged library focusing on this domain, perfect!”.
But then I started inspecting the source code (I have arrays of derived types, so I could not just include the dependency, but instead try to copy the algorithm in my own code).
Two things immediately came to my attention
The body of
union(a,b)itself deferres the functionality to
set(which seems kind of obvious) applied to the merge of the two arrays. But such a merge is performed as
[a,b]which I think is far from performance-optimized. I’m almost sure
move_allocshould be used instead (but happy to be proven wrong, the automatic concatenation is infinitely more intuitive and elegant, syntax-wise).
The implementation of
set(a)is recursive which has rung the definitive bell. Too much worry about stack management. Still, given the excellent presentation of the library in the README I thought I should be mistaken and came here to investigate about tail-call optimization in fortran and so on, to find that no, the entire project has been carried with no attention about this kind of issues, the scope itself was different.
This is perfectly fine, don’t get me wrong, but it calls in my opinion for a clear statement about scope and status in the README. Probably something as eye catching as a shields.io badge like:
setimplementation could have well been done in a loop, without even really changing the logic: you loop on all the indices of the input array and, if the condition you already use inside the
packcall is satisfied, you copy the value in a preallocated —set to all zeros— comformant array. Then, when you finish your span, you call just once
packon this array with the
array/=0condition and yeah, that’s your result. All this to say that I get that your implementation was more about an intellectual experiment, then production code. And that’s what I mean with “different scope”. ↩︎
Right, it was an exercise in style, not performance. I don’t even know how fast or slow it is. Some compilers may be better than others for heavily recursive stuff like this. I wouldn’t call it experimental because to me that suggests it may or may not work. It does work. I’ll add a note about not optimizing for performance and YMMV.
Yes, you’re right about the misleading message an “experimental” badge could convey. I was in lack of a better word. Such a note would work perfectly!
3 posts were split to a new topic: Implement set union of many small arrays
@milancurcic Could you explain with a basic example why this is ? or point me to a resource that does.
Because recursive Fortran code tends to lead to less performant binaries than the same algorithm expressed with loops. And why that is, I don’t know. Perhaps Fortran compilers have prioritized optimization of loops over recursion, and perhaps recursion is just more difficult to optimize. I haven’t played with this stuff in years and things may have changed. Compiler experts here like @kargl, @gak, or @certik, may know whether and why recursive code is more difficult to optimize than loops.
The classic technique for optimizing recursion is tail call elimination. This recognizes that no computation is performed after the recursive call, an replaces the call with a jump. This effectively replaces the recession with a loop.
My guess is that few compilers bother to do this analysis for cost-benefit reasons. My guess is that if you looked at a million lines of Fortran, you would see tail call optimization opportunities appear a handful of times, if at all.
And one can probably make the case that array-oriented algorithms are not (usually) naturally solved with recursion.
I use GCC and sometimes, I see a function has been tail call optimized but only when I am in a fortunate enough position to program with that in mind. I think the issue is the compiler has to be certain that the optimization can be safely done and that’s not that easy. My untested, gut hunch is that if the function is declared “PURE” the compiler may get a better hit rate on the tail call optimization. I program a reasonable amount in C and the ability to return a value or call a function as part of the return definitely aids in getting tail call optimization eg.
n = function_x(…)
as the compiler can more easily divine what is going on, assess the function and perform the optimization.