I’m trying to port Donald Knuth’s fast bit reversal algorithm from C to Fortran, as described in Chapter 7 of Hacker’s Delight, 2nd edition, by Henry S. Warren.
The original algorithm looks like this for 32 bits:
unsigned rev(unsigned x) {
unsigned t;
x = (x << 15) | (x >> 17); // Rotate left 15.
t = (x ^ (x>>10)) & 0x003F801F; x = (t | (t<<10)) ^ x;
t = (x ^ (x>> 4)) & 0x0E038421; x = (t | (t<< 4)) ^ x;
t = (x ^ (x>> 2)) & 0x22488842; x = (t | (t<< 2)) ^ x;
return x;
}
And the following for 64 bits:
unsigned long long rev(unsigned long long x) {
unsigned long long t;
x = (x << 32) | (x >> 32) // Swap register halves
x = (x & 0x0001FFFF0001FFFFLL) << 15 | // Rotate left
(x & 0xFFFE0000FFFE0000LL) >> 17; // 15.
t = (x ^ (x >> 10)) & 0x003F801F003F801FLL;
x = (t | (t << 10)) ^ x;
t = (x ^ (x >> 4)) & 0x0E0384210E038421LL;
x = (t | (t << 4)) ^ x;
t = (x ^ (x >> 2)) & 0x2248884222488842LL;
x = (t | (t << 2)) ^ x;
return x;
}
This is what I’ve come up with so far:
INTEGER FUNCTION bit_rev_knuth_32(i_,number_bits)
IMPLICIT NONE
! Input argument
INTEGER, INTENT(IN) :: i_, number_bits
! Intrinsic functions
INTRINSIC :: IOR, IAND, IEOR, ISHFT
! Internal variables
INTEGER :: x, t
! Copy input to x
x = i_
x = IOR( ISHFT(x,15), ISHFT(x,-17) ) ! Rotate left 15
t = IAND( IEOR( x, ISHFT(x,-10) ), z'003F801F' ) ! t = (x ^ (x >> 10)) & 0x003F801F
x = IEOR( IOR( t, ISHFT(t, 10) ), x ) ! x = (t | (t << 10)) ^ x
t = IAND( IEOR( x, ISHFT(x,-4) ), z'0E038421' ) ! t = (x ^ (x >> 4)) & 0x0E038421
x = IEOR( IOR( t, ISHFT(t, 4) ), x ) ! x = (t | (t << 4)) ^ x
t = IAND( IEOR( x, ISHFT(x,-2) ), z'22488842' ) ! t = (x ^ (x >> 2)) & 0x22488842
x = IEOR( IOR( t, ISHFT(t, 2) ), x ) ! x = (t | (t << 2)) ^ x
! Return x shifted to the correct bit length
bit_rev_knuth_32 = ISHFT(x,(number_bits-32))
RETURN
END FUNCTION bit_rev_knuth_32
INTEGER(KIND=int64) FUNCTION bit_rev_knuth_64(i_,number_bits)
IMPLICIT NONE
! Input argument
INTEGER(KIND=int64), INTENT(IN) :: i_, number_bits
! Intrinsic functions
INTRINSIC :: IOR, IAND, IEOR, ISHFT
! Internal variables
INTEGER(KIND=int64) :: x, t
! Copy input to x
x = i_
x = IOR( ISHFT(x,32), ISHFT(x,-32) ) ! Swap register halves
x = IOR( ISHFT( IAND( x, z'0001FFFF0001FFFF' ), 15 ), & ! Rotate left 15
ISHFT( IAND( x, z'FFFE0000FFFE0000' ),-17 ) )
t = IAND( IEOR( x, ISHFT(x,-10) ), z'003F801F003F801F' ) ! t = (x ^ (x >> 10)) & 0x003F801F003F801FLL
x = IEOR( IOR( t, ISHFT(t, 10) ), x ) ! x = (t | (t << 10)) ^ x
t = IAND( IEOR( x, ISHFT(x,-4) ), z'0E0384210E038421' ) ! t = (x ^ (x >> 4)) & 0x0E0384210E038421LL
x = IEOR( IOR( t, ISHFT(t, 4) ), x ) ! x = (t | (t << 4)) ^ x
t = IAND( IEOR( x, ISHFT(x,-2) ), z'2248884222488842' ) ! t = (x ^ (x >> 2)) & 0x2248884222488842LL
x = IEOR( IOR( t, ISHFT(t, 2) ), x ) ! x = (t | (t << 2)) ^ x
! Return x shifted to the correct bit length
bit_rev_knuth_64 = ISHFT(x,(number_bits-64))
RETURN
END FUNCTION bit_rev_knuth_64
I have the following concerns:
- Fortran does not have unsigned integers. I think the current translation can handle bit sizes up to 31 bits (for the 32-bit version) and up to 63 bits (for the 64-bit version). How can I handle bit lengths of exactly 32 bits and exactly 64 bits?
- Would converting
ISHFT
intrinsics toSHIFTL
andSHIFTR
be beneficial for speed/optimization reasons? These two intrinsics are introduced in Fortran 2008, and MFE seems to favor them overISHFT
, but looks like not all compilers support them yet… notably NVIDIA HPC SDK, which I extensively use for its stellar OpenACC support. - Eventually, I would want to convert these into GPU kernels. I wonder what is the most efficient way to perform bitwise operations on the GPU… Right now I’m leaning towards OpenACC, since it’s the GPU programming model that I’m most familiar with, but I am willing to consider OpenMP target offload or CUDA Fortran too.
Any help would be appreciated.