Fastest sorting for a specific genre of unordered numbers

An easy way to sort a contiguous Fortran array is to build upon the C++ standard library functions std::sort and std::stable_sort. You do however need a compiler that supports the std::span container introduced in C++20. The code demonstrated below was tested succesfully with gcc 11.1.0

Here are the C++ routines:

#include <algorithm>
#include <numeric>
#include <span>

extern "C" {

// Finds the indices that would sort an array
//
// Performs an indirect sort of the rank-1 array `a` of size `n`. 
// Indices are returned in the array `index`.
// Fortran or 1-based numbering is assumed for the `index` array.
//
void cpp_argsort(int n, const double *a, int *index) {

  std::span inds(index, static_cast<size_t>(n));
  std::iota(inds.begin(),inds.end(),1);

  auto compare = [&a](const int& il, const int& ir) {
    return a[il-1] < a[ir-1];
  };

  std::sort(inds.begin(),inds.end(),compare);

}

// Finds the indices that would sort an array while preserving order
// between equal elements.
//
// Performs an indirect sort of the rank-1 array `a` of size `n`. 
// Indices are returned in the array `index`.
// Fortran or 1-based numbering is assumed for the `index` array.
//
void cpp_stable_argsort(int n, const double *a, int *index) {

  std::span inds(index, static_cast<size_t>(n));
  std::iota(inds.begin(),inds.end(),1);

  auto compare = [&a](const int& il, const int& ir) {
    return a[il-1] < a[ir-1];
  };

  std::stable_sort(inds.begin(),inds.end(),compare);
}


} // extern "C"

First, extern "C" keyword is used to generate external linkage and prevent compiler name mangling. The dummy variables are all C/Fortran compatible. The span container is used to “wrap” the raw contiguous index buffer with the whole C++ iterator machinery. Next we use std::iota to fill the index array with sequentially increasing values starting from 1 due to Fortran indexing. A lambda expression is used to perform the comparison, by referencing into the original array of doubles a with a capture y reference. Finally, we pass the iterators and lambda compare function to the actual sorting routine.

Here’s a Fortran driver that calls the procedure:

program test_cpp_argsort

   use iso_c_binding, only: c_double, c_int
   implicit none

   interface
      subroutine cpp_argsort(n,array,index) bind(C,name="cpp_argsort")
         import c_double, c_int
         integer(c_int), intent(in), value :: n
         real(c_double), intent(in) :: array(*)
         integer(c_int), intent(inout) :: index(*)
      end subroutine
      subroutine cpp_stable_argsort(n,array,index) bind(C, &
           name="cpp_stable_argsort")
         import c_double, c_int
         integer(c_int), intent(in), value :: n
         real(c_double), intent(in) :: array(*)
         integer(c_int), intent(inout) :: index(*)
      end subroutine
   end interface

   integer, parameter :: n = 5
   real(c_double) :: a(n)
   integer(c_int) :: inds(n)

   a = [real(c_double) :: 1., 5., 4., 2., 3.]

   write(*,'(A)') "Before:"
   write(*,*) a

   call cpp_argsort(n,a,inds)

   write(*,'(A)') "After:"
   write(*,*) a(inds)

end program

Finally, the result:

$ g++ -c -std=c++20 -O2 -march=native cpp_argsort.cc 
$ gfortran -Wall -O2 -march=native test_cpp_argsort.f90 cpp_argsort.o -o test -lstdc++
$ ./test
Before:
   1.0000000000000000        5.0000000000000000        4.0000000000000000        2.0000000000000000        3.0000000000000000     
After:
   1.0000000000000000        2.0000000000000000        3.0000000000000000        4.0000000000000000        5.0000000000000000    

At least for the built-in types which are interoperable, using std::span is an easy way to unleash the full potential of the C++ standard library algorithms.

3 Likes