Transferring bits with different integer KINDs

I have a question about how to transfer bits from one integer to another integer of a different kind. Let me use int64 and int32 as an example, but I would really like to know how to best do this for arbitrary integer kinds. So suppose I have some bits stored in the int64 variable i64, and I want to extract the low-order 32 bits and place them into the int32 variable i32.

The best and clearest way to do this should be something like

call mvbits( i64, 0, 32, i32, 0 )

Unfortunately, for some unknowable reason, that is not allowed. The mvbits() intrinsic requires the KIND values of the two integer variables to be the same. OK, there are lots of other bit operators, so let's look for other clear approaches. The f77 way to do this is with the equivalence hack.

integer(int64) :: i64
integer(int32) :: j(2), i32
equivalence (i64,j)
i64 = ...whatever...
i32 = j(1)

There are two problems with this. The first is that it only works for little-endian addressing. On a big-endian machine the last statement should be i32 = j(2). So you can add some code to test for endian conventions, and then finally get the right assignment. OK, that works, but it is no longer simple and clear. The other problem is that equivalence is an obsolescent feature. The NAG compiler, for example, will not compile the code without special options to enable the feature and to disable the error messages. So none of that is clean.

You can also remove the equivalence statement and use transfer.

j = transfer( i64, j )
i32 = j(1)

but the same endian problems arise with the last assignment. So it is no longer clear and simple, and why should the programmer need to worry about endian addressing conventions at all? He knows which bits he wants, and he knows where to put them, so the endian addressing seems like an unnecessary distraction.

So let's try another approach. The expression ibits(i64,0,32) gets the right bits. In fact, the programmer can extract any number of bits from any location that way. This can also be done with combinations of left and right shifts, which is probably what ibits() does anyway. The problem here is that the result of the ibits() function has the same kind as its first argument, so the result is int64 in this case. So when the result is a nonnegative value no larger than huge(i32), the simple assignment

i32 = ibits(i64,0,32)

will work. But if bit 31 is set, then the value of ibits() exceeds the range of the int32 variable on the lhs, and the result of the assignment is not defined by the standard (or, if overflows are trapped, it can cause a run-time exception). That leads to multistep assignments, where each of the intermediates is small enough to avoid the overflow problem. Of the several possibilities, here is one example.

i32 = ibits(i64,0,31)
if ( btest(i64,31) ) i32 = ibset(i32,31 )

I think that works, and is portable and standard conforming, but it seems both complicated and inefficient.
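Another possibility along the same lines, using the shift intrinsics mentioned above, is to sign-extend the low-order 32 bits inside the int64 value before converting. This is only a sketch: it assumes the F2008 intrinsics shiftl() and shifta() are available, and that the processor uses two's-complement integers and accepts -huge(i32)-1 as an int32 value (which the usual compilers do, although the standard's integer model is symmetric). The program name is made up for illustration.

```fortran
program low32_by_shifts
   use, intrinsic :: iso_fortran_env, only: int32, int64
   implicit none
   integer(int64) :: i64
   integer(int32) :: i32

   ! Test value with bit 31 set and bit 63 clear.
   i64 = ibset( (2_int64**62 / 3) * 4 + 1, 31 )

   ! shiftl discards the high-order 32 bits; shifta then shifts back while
   ! replicating bit 31 into the upper half. The int64 intermediate therefore
   ! lies within the int32 range before int() converts it (assumption:
   ! two's-complement representation).
   i32 = int( shifta( shiftl( i64, 32 ), 32 ), int32 )

   print '(b32.32,1x,i0)', i32, i32
end program low32_by_shifts
```

There is no branch here, but whether a compiler turns this into a plain register move is something I have not checked.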

I experimented a little with these various possibilities, and with the gfortran and flang compilers I found that the simple assignment

i32 = ibits(i64,0,32)

does in fact work, even if bit 31 is set. In fact, the assignment i32=i64 effectively just moves the low-order bits into i32, ignoring the high-order bits. The same thing seems to occur with other bit operators where the result is int64 and bit 31 is set. That is a convenient feature for doing this kind of bit manipulation (and maybe an inconvenient feature for other purposes), but I do not think that behavior is specified by the standard, so I wonder how portable it is across compilers or with various compiler options (particularly those that enable tests for overflows during assignments).

The programmer might try to do something a little more explicit

i32 = int( ibits(i64,0,32), int32 )

but I think this has the same undefined behavior regarding overflows. That is just an explicit way to do what the automatic conversion rules are doing anyway. F2023 section 16.9.110 does not give any further guidance on this issue.

I think in principle this could be fixed in the standard with an extra argument

i32 = ibits(i64,0,32, kind=int32)

just as mvbits(i64,0,32,i32,0) could be fixed in the standard to work correctly, but that kind of change could take five or ten years.
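In the meantime, a user-written wrapper can stand in for the proposed kind= argument. The following is a minimal sketch built on the two-step ibits()/btest()/ibset() idea from earlier in this post. The module name bits_across_kinds and the function name ibits_i32 are hypothetical, and the argument restrictions in the comments are my assumptions, not something the code checks.

```fortran
module bits_across_kinds
   use, intrinsic :: iso_fortran_env, only: int32, int64
   implicit none
   private
   public :: ibits_i32
contains
   ! Hypothetical stand-in for ibits(i, pos, len, kind=int32).
   ! Assumes 0 <= pos, 0 <= len <= 32, and pos + len <= 64 (not checked).
   elemental function ibits_i32(i, pos, len) result(r)
      integer(int64), intent(in) :: i
      integer, intent(in) :: pos, len
      integer(int32) :: r
      integer :: n

      ! Extract at most 31 bits first, so the int64 intermediate is
      ! nonnegative and no larger than huge(r) when it is converted.
      n = min( len, 31 )
      r = int( ibits( i, pos, n ), int32 )

      ! A 32nd bit, if requested, is set with a bit operation on the
      ! int32 result instead of relying on an out-of-range conversion.
      if ( len == 32 ) then
         if ( btest( i, pos + 31 ) ) r = ibset( r, 31 )
      end if
   end function ibits_i32
end module bits_across_kinds

program demo_ibits_i32
   use, intrinsic :: iso_fortran_env, only: int32, int64
   use bits_across_kinds, only: ibits_i32
   implicit none
   integer(int64) :: i64
   integer(int32) :: i32

   i64 = ibset( (2_int64**62 / 3) * 4 + 1, 31 )
   i32 = ibits_i32( i64, 0, 32 )
   print '(b32.32,1x,i0)', i32, i32
end program demo_ibits_i32
```

A version for arbitrary pairs of kinds would need one specific function per pair behind a generic interface, which is exactly the kind of boilerplate an intrinsic would avoid.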

So my question is what is the best approach for a programmer to use when moving bits between integers of different KINDs? I think this has been an issue since f90, so for some 35 years now. I guess this doesn’t bother enough people to have been of any concern all that time.

I think an intrinsic function to test for endianness would also be useful. It is not hard to write your own, but like a lot of little things like this, it would be better (IMHO) if it were an intrinsic.
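For reference, a home-made test might look like the sketch below. The function name is_little_endian is hypothetical, and the code assumes that the int8 and int32 kinds exist and that the machine is either purely little-endian or purely big-endian (no mixed byte orders). It also shows the test being used to pick the right element in the transfer() approach.

```fortran
program endian_check
   use, intrinsic :: iso_fortran_env, only: int8, int32, int64
   implicit none
   integer(int64) :: i64
   integer(int32) :: j(2), i32

   i64 = int( z'0123456789ABCDEF', int64 )
   j = transfer( i64, j )

   ! Pick whichever array element holds the low-order half of i64.
   i32 = merge( j(1), j(2), is_little_endian() )

   print '(a,l1)', 'little endian: ', is_little_endian()
   print '(z8.8)', i32
contains
   ! Hypothetical helper: true when the least significant byte of an
   ! integer is stored at the lowest address.
   logical function is_little_endian()
      integer(int8) :: bytes(4)
      ! View the four bytes of the int32 value 1 in storage order.
      bytes = transfer( 1_int32, bytes )
      is_little_endian = ( bytes(1) == 1_int8 )
   end function is_little_endian
end program endian_check
```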

Here is a little program that demonstrates what happens with the assignment i32=i64.

program overflow
   use, intrinsic :: iso_fortran_env, only: int32, int64
   integer(int64) :: i64
   integer(int32) :: i32

   i64 = (2_int64**62 / 3) * 4 + 1
   call printit('00')
   i64 = ibset( i64, 63 )
   call printit('10')
   i64 = ibset( i64, 31 )
   call printit('11')
   i64 = ibclr( i64, 63 )
   call printit('01')
contains

   subroutine printit(ch)
      character(*), intent(in) :: ch
      character(*), parameter :: c64 = '(a,1x,a,b64.64,1x,i0)', c32 = '(a,1x,a,b32.32,1x,i0)'
      print c64, ch, 'i64=', i64, i64
      i32 = i64
      print c32, ch, 'i32=', i32, i32
      return
   end subroutine printit
end program overflow

The output with gfortran, flang, and nagfor is:

$ gfortran overflow.f90 && a.out
00 i64=0101010101010101010101010101010101010101010101010101010101010101 6148914691236517205
00 i32=01010101010101010101010101010101 1431655765
10 i64=1101010101010101010101010101010101010101010101010101010101010101 -3074457345618258603
10 i32=01010101010101010101010101010101 1431655765
11 i64=1101010101010101010101010101010111010101010101010101010101010101 -3074457343470774955
11 i32=11010101010101010101010101010101 -715827883
01 i64=0101010101010101010101010101010111010101010101010101010101010101 6148914693384000853
01 i32=11010101010101010101010101010101 -715827883

So the assignment appears to simply copy the low-order 32 bits while ignoring the high-order bits. However, I don’t think this behavior is prescribed in the standard. These results were obtained on an arm64 machine in little-endian mode (macOS). There is also the question of how this code behaves on a big-endian machine.