I was interested! But after inspecting a bit I got scared by performance implications. I saw then the github issue about time complexity and the discussion on HN and decided to roll my own, loop-based, implementation of union
.
Some more info about what could have been my use case, if it is interesting to you:
This morning I stumbled upon the problem of building a set union of many small arrays, with (a lot) of values in common (in the worst case scenario the size of the union would be orders of magnitude smaller than the size of a simple concatenation). So I started a google hunt for an efficient implementation of such a thing. When I stumbled on your repo my first reaction was “yes, a clean, well documented, full-fledged library focusing on this domain, perfect!”.
But then I started inspecting the source code (I have arrays of derived types, so I could not just include the dependency, but instead try to copy the algorithm in my own code).
Two things immediately came to my attention
-
The body of
union(a,b)
itself deferres the functionality toset
(which seems kind of obvious) applied to the merge of the two arrays. But such a merge is performed as[a,b]
which I think is far from performance-optimized. I’m almost suremove_alloc
should be used instead (but happy to be proven wrong, the automatic concatenation is infinitely more intuitive and elegant, syntax-wise). -
The implementation of
set(a)
is recursive which has rung the definitive bell. Too much worry about stack management. Still, given the excellent presentation of the library in the README I thought I should be mistaken and came here to investigate about tail-call optimization in fortran and so on, to find that no, the entire project has been carried with no attention about this kind of issues, the scope itself was different[1].
This is perfectly fine, don’t get me wrong, but it calls in my opinion for a clear statement about scope and status in the README. Probably something as eye catching as a shields.io badge like:
The
set
implementation could have well been done in a loop, without even really changing the logic: you loop on all the indices of the input array and, if the condition you already use inside thepack
call is satisfied, you copy the value in a preallocated —set to all zeros— comformant array. Then, when you finish your span, you call just oncepack
on this array with thearray/=0
condition and yeah, that’s your result. All this to say that I get that your implementation was more about an intellectual experiment, then production code. And that’s what I mean with “different scope”. ↩︎