Fastest sorting for a specific genre of unordered numbers

Thanks, @ivanpribec, for your highlighting the differences between qsort() and std::sort. It would be useful to Fortran programmers to have a recipe for calling std::sort from a Fortran program (including a recipe for specifying the comparison function/functor in C++, if necessary, or in Fortran, if possible).

I read a couple of your references, and I wondered why, when comparing the qsort() of C and the std::sort of C++ they did not note that the two routines/ methods may well use different algorithms. For example, qsort() may use mergesort in one C library, and quicksort in another C library! Furthermore, the scaling law for sort time versus array size may be different for the two implementations, especially if even one of the two uses a hybrid algorithm.

@mecej4, you make several good points. Upon re-reading the Stack Overflow threads, a few less visible comments did remark the differences could be attributed to different algorithms. We would need to look at the C and C++ runtime-library implementations to know for sure. From the pragmatic point of view C++ seems to have a small edge. Both routines are black-boxes from the user perspective.

With regard to a more general recipe, are you interested in sorting also other types than the intrinsic ones? Is an indirect sort sufficient, or would an in-place sort also be desirable? For derived types, the C++ code will be a somewhat more complex because the size of the array elements has to be communicated manually across the language barrier.

One more difference is that the C++ template library makes it very easy to write an indirect sort since the “lambda” comparison function can capture variables from the enclosing scope.

With the standard C qsort you would need to pack your keys and indexes into an array of structures, leading to duplication. The fundamental issue is that C doesn’t support nested functions.

That raises the interesting question whether a Fortran internal subprogram can have the bind(C) attribute and could hence be used to do an indirect sort using C’s qsort?

1 Like

Counter to what I expected, it’s perfectly feasible. Even more surprisingly, host association in internal functions permits using qsort for indirect sorting, which is not possible in pure C without the use of global variables.

Here’s a demonstration:

module sorting
   use, intrinsic :: iso_c_binding
   implicit none
   private
   public :: argsort
contains

   !> Returns the indices that would sort an array
   !>
   !> Arguments:
   !>   n: number of elements
   !>   list: on entry, list of indices 1..(1)..n 
   !>           (for reverse sort, n..(-1)..1)
   !>         on return, the indices such that keys(list) would be sorted
   !>   keys: array to be sorted
   !>
   subroutine argsort(n,list,keys)
      integer, intent(in) :: n
      integer, intent(inout), target :: list(*)
      integer, intent(in) :: keys(*)

      interface
         subroutine qsort(ptr,count,size,comp) bind(c,name="qsort")
            import c_ptr, c_int, c_size_t, c_funptr
            implicit none
            type(c_ptr), intent(in), value :: ptr
               !> pointer to the array to sort
            integer(c_size_t), intent(in), value :: count
               !> number of elements in the array
            integer(c_size_t), intent(in), value :: size
               !> size of each element in the array in bytes
            type(c_funptr), intent(in), value :: comp
               !> comparison function which returns ​a negative integer value 
               !> if the first argument is less than the second, a positive 
               !> integer value if the first argument is greater than the 
               !> second and zero if the arguments are equivalent
         end subroutine
      end interface

      if (n < 2) 
         ! list already sorted
         return
      end if

      call qsort(c_loc(list),int(n,c_size_t),c_sizeof(list(1)), &
         c_funloc(argsort_compare))

   contains
      ! technically, the inputs should be void pointers
      ! but works nevertheless
      integer(c_int) function argsort_compare(a,b) bind(c)
         integer(c_int), intent(in) :: a, b
         argsort_compare = keys(a) - keys(b)
      end function
   end subroutine
end module

program main

   use sorting
   implicit none

   integer :: list(4)
   integer :: keys(4)

   list = [1,2,3,4]
   keys = [7,5,6,3]

   call argsort(4,list,keys)

   ! Supposed to print [ 3, 5, 6, 7]
   print *, keys(list)

end program

Notice the comparison function is encoded in Fortran. So the conclusion in my previous post is curiously enough - wrong. You can perform an indirect sort using qsort without global variables from Fortran. :exploding_head:

Further, we could give argsort the bind(c) attribute, making it a callable from C again. Using Fortran to bypass restrictions of C would make an interesting addition to the Wikipedia entry on qsort.

2 Likes

That’s a very interesting interface - I am not sure if the interfacing plays a role on some overhead, but I’ve compared it against my own implementation of argsort and the results are quite interesting: with gfortran, the Fortran version is always at least some ~20-30% faster than the C counterpart, see for example here:

module sorting_f
   use iso_fortran_env, only: int32,int64
   implicit none
   private

   integer, parameter, public :: IKIND = int32
   integer, parameter, public :: ISIZE = int64
  
   public :: argsort_f
contains


   !> Argsort: Federico's implementation
   subroutine argsort_f(n,list,keys)
      integer(ISIZE), intent(in)    :: n
      integer(ISIZE), intent(inout) :: list(*) ! indices to the sorted array
      integer(IKIND), intent(in)    :: keys(*) ! Array to be sorted

      integer :: sorted(n)

      sorted = keys(1:n)
      call int_quick_sort_andlist(sorted,list(1:n))

   end subroutine argsort_f

   pure recursive subroutine int_quick_sort_andlist(list,ilist,down)

      integer(IKIND), dimension(:), intent(inout) :: list
      integer(ISIZE), dimension(size(list)), intent(inout) :: ilist
      logical, optional, intent(in) :: down

      integer(ISIZE)     :: i, j, n
      integer(ISIZE), parameter :: max_simple_sort_size = 8
      integer(IKIND)     :: chosen
      logical            :: descending

      descending = .false.; if (present(down)) descending = down
      n = size(list)

      choose_sorting_algorithm: if (n <= max_simple_sort_size) then
         ! Use interchange sort for small lists
         do i = 1, size(list) - 1
            do j = i + 1, size(list)
               if (toBeSwapped(list(i),list(j),.false.)) then
                   call swap(list(i),list(j))
                   call swaplist(ilist(i),ilist(j))
               end if
            end do
         end do

      else
         ! Use partition (quick) sort if the list is big
         chosen = list(int(n/2))
         i = 0
         j = n + 1

         scan_lists: do
            ! Scan list from left end
            ! until element >= chosen is found
            scan_from_left: do
              i = i + 1
              if (toBeSwapped(list(i),chosen,.true.) .or. i==n) exit scan_from_left
            end do scan_from_left

            ! Scan list from right end
            ! until element <= chosen is found

            scan_from_right: do
               j = j - 1
               if (toBeSwapped(chosen,list(j),.true.) .or. j==1) exit scan_from_right
            end do scan_from_right

            swap_element: if (i < j) then
                ! Swap two out of place elements
                call swap(list(i),list(j))
                call swaplist(ilist(i),ilist(j))
            else if (i == j) then
                i = i + 1
                exit
            else
                exit
            endif swap_element

         end do scan_lists

         if (1 < j) call int_quick_sort_andlist(list(:j),ilist(:j),down)
         if (i < n) call int_quick_sort_andlist(list(i:),ilist(i:),down)

      end if choose_sorting_algorithm ! test for small array

      contains

         elemental logical function toBeSwapped(a,b,orEqual)
            integer(IKIND), intent(in) :: a,b
            logical, intent(in) :: orEqual

            toBeSwapped = merge(a<b,a>b,descending)
            if (orEqual .and. a==b) toBeSwapped = .true.

         end function toBeSwapped

         elemental subroutine swap(a,b)
            integer(IKIND), intent(inout) :: a,b
            integer(IKIND) :: tmp
            tmp = a
            a = b
            b = tmp
         end subroutine swap
         elemental subroutine swaplist(a,b)
            integer(ISIZE), intent(inout) :: a,b
            integer(ISIZE) :: tmp
            tmp = a
            a = b
            b = tmp
         end subroutine swaplist

   end subroutine int_quick_sort_andlist

end module sorting_f

module sorting
   use, intrinsic :: iso_c_binding
   implicit none
   private
   public :: argsort
contains

   !> Returns the indices that would sort an array
   !>
   !> Arguments:
   !>   n: number of elements
   !>   list: on entry, list of indices 1..(1)..n 
   !>           (for reverse sort, n..(-1)..1)
   !>         on return, the indices such that keys(list) would be sorted
   !>   keys: array to be sorted
   !>
   subroutine argsort(n,list,keys)
      integer, intent(in) :: n
      integer, intent(inout), target :: list(*)
      integer, intent(in) :: keys(*)

      interface
         subroutine qsort(ptr,count,size,comp) bind(c,name="qsort")
            import c_ptr, c_int, c_size_t, c_funptr
            implicit none
            type(c_ptr), intent(in), value :: ptr
               !> pointer to the array to sort
            integer(c_size_t), intent(in), value :: count
               !> number of elements in the array
            integer(c_size_t), intent(in), value :: size
               !> size of each element in the array in bytes
            type(c_funptr), intent(in), value :: comp
               !> comparison function which returns ​a negative integer value 
               !> if the first argument is less than the second, a positive 
               !> integer value if the first argument is greater than the 
               !> second and zero if the arguments are equivalent
         end subroutine
      end interface

      if (n < 2) then
         ! list already sorted
         return
      end if

      call qsort(c_loc(list),int(n,c_size_t),c_sizeof(list(1)), &
         c_funloc(argsort_compare))

   contains
      ! technically, the inputs should be void pointers
      ! but works nevertheless
      integer(c_int) function argsort_compare(a,b) bind(c)
         integer(c_int), intent(in) :: a, b
         argsort_compare = keys(a) - keys(b)
      end function
   end subroutine

end module sorting

program main
   use iso_c_binding, only: c_int
   use sorting
   use sorting_f
   implicit none

   integer(ISIZE), parameter :: N = 100000

   real, allocatable :: x(:)
   integer(ISIZE), allocatable :: listf(:)
   integer(IKIND), allocatable :: keys(:),list(:)
   integer(ISIZE) :: j
   real :: t0,t1,t2

   allocate(list(N),keys(N),x(N))
   call random_number(x)
   keys(:) = nint(N*x)
   list(:) = [(j,j=1,N)]
   listF   = list

   call cpu_time(t0); call argsort(int(N,c_int),list,keys)
   print *, keys(1:5),list(1:5)
   call cpu_time(t1); call argsort_f(N,listf,keys)
   print *, keys(1:5),list(1:5)
   call cpu_time(t2)

   ! Supposed to print [ 3, 5, 6, 7]
   print *, 'C time=',t1-t0
   print *, 'F time=',t2-t1

end program