Moving Bits Question

RonShepard · November 29, 2022, 3:16am

Perhaps this is a matter of perspective, but the high bit in a ones-complement integer is the sign bit. It is 0 for positive integer values and 1 for negative integer values. Of course there is a positive zero value, 000…000, and a negative zero value, 111…111, so those either do or don’t count, depending on what exactly a value of zero means at the moment. I don’t know of any modern machines that use ones-complement arithmetic, so I cannot verify what exactly an expression like int(i32,int64) does when i32 is one of those zero values, but at least it looks like the sign bit would get propagated during the kind conversion.

dwwork · November 29, 2022, 4:17am

I’ve gotten lost in the different threads of this discussion. Do I understand your issue correctly? You want to pack 2 32-bit signed integers into a single signed 64-bit integer, and then later retrieve those 2 32-bit signed integers on the same machine (so no cross-endianess issues?)?

If I understand your situation correctly, then I feel the solution is quite simple:

function i32_to_i64(i1,i2)
    integer(int32), intent(in) :: i1,i2
    integer(int32) :: tmp(2)
    integer(int64) :: i32_to_i64
    tmp(1) = i1
    tmp(2) = i2
    i32_to_i64 = transfer(tmp,i32_to_i64)
end function i32_to_i64

At this point, the 64-bit integer doesn’t mean anything, other than as a carrier for those two smaller ints.

Then you can get them back out with the reverse operation:

function i64_to_i32(i64)
    integer(int64), intent(in) :: i64
    integer(int32) :: i64_to_i32(2)
    i64_to_i32 = transfer(i64,i64_to_i32)
end function i64_to_i32

But…I feel like I’m not understanding something about your problem which renders it not so simple.

FortranFan · November 29, 2022, 4:38am

@dwwork, you’re not the only one, that’s why I wrote in this comment about OP @RonShepard needing to illustrate the aspects that are all of interest to the issue on hand, now that readers all have had their fill about being humored with many tangents in the discussion!

Note TRANSFER has been brought up from the original post up to the one just prior to yours. OP’s concerns with TRANSFER appear based on the long complaint with its use that its specification in the standard leaves a bit too much as processor-dependent and thus it can prove inadequate if one were to strive for portable code. I can buy that argument as a generic complaint against TRANSFER since it allows conversion across different types. But in this specific context involving integer types, it makes no sense, thus OP should provide better illustration of any concerns with TRANSFER.

Hopefully your post will encourage OP to answer your questions and also to let on with the crucial details, Or at least point to the relevant code online that readers can review and suggest standard-conforming, portable options,

dwwork · November 29, 2022, 5:35am

Being mostly a C++ guy now, transfer seems to me to just be a memcpy(). Perhaps there are some subtle difference I am unaware of though.

In fact, this is the only legal way to convert between types in C++. (union type-punning and reinterpret_casting are considered “undefined behavior”), and this function that I wrote in Fortran has been enshrined in C++20 as std::bit_cast. In C++20, this looks like:

int64_t i32_to_i64(int32_t i0, int32_t i1) {
    int32_t tmp[2] = {i0, i1};
    return std::bit_cast<int64_t>(tmp);
}

The compilers in fact recognize this idiom so well, they don’t actually do the memcpy() at all! Kinda dumb, but there it is.

RonShepard · November 29, 2022, 7:20am

I agree that C memcpy() is effectively the same as fortran transfer(). My legacy fortran code uses a variety of kludges to solve this problem, including type punning and equivalence of different types. Those things are now obsolete and/or deleted from the fortran standard, so I’m trying to eliminate them from the codes. I’m working on this part of the code for other reasons, so I thought now might be a good time to just convert everything to modern fortran (meaning the standard bit operators, no equivalence, no type punning). Fortran has a standard definition of the bits within an integer, and it uses that definition of its integers to define its bit operators. That definition avoids byte addressing conventions, so in principle it should allow the programmer to write portable code that works with either addressing convention. But in practice, it seems that isn’t really possible after all. In my earlier post were I was explaining the general problem, I said

My general problem is to work with strings of bits of arbitrary lengths. A bit string data type would also solve all of these issues, but that doesn’t exist in the standard. So I’m doing things with int64 arrays the hard way. If I store two 10-bit strings in adjacent locations, then I want to be able to retrieve them back as one 20-bit string in a consistent way. All of this endianess ambiguity gets in the way of doing that in a portable way. I suspect the code I have working now on my little-endian machine will not work on a big-endian machine.

Since posting that, I now have localized the endian addressing parts of the problem, and I have the other parts of the code working in what I think is a portable way. I’ll know for certain after I compile and run the codes on a big-endian machine.

I also observed that if the intrinsic mvbits() subroutine were generalized, just slightly, then that would effectively solve all of these addressing convention problems. That routine already has all the information it needs to work, so this change could be done in a backwards compatible way that does not break old code.

For the 32-bit field case, which should be a simple case since there are intrinsic 32-bit integers, the problem is how do you extract the low-order 32 bits of a 64-bit integer? That is, what is equivalent to

call mvbits( i64, 0, 32, i32, 0 )

That statement will not compile because i64 and i32 are different kinds, but that is the operation that I’m asking about (that and its reverse, inserting 32 bits from i32 into the low-order bits of i64). My general problem is to extract arbitrary strings from i64 into arbitrary positions into i32. This could be written as

call mvbits( i64, ipos, len, i32, jpos )

It is disappointing that this operation is not allowed in this simple form.

msz59 · November 29, 2022, 9:42am

I fully agree that the specification of mvbits could and should allow for different kinds of source and destination.

As for transfer being equivalent to C/C++ memcpy() it seems to hold only if the bit-size of the result is less or equal to that of the source. If the result is longer than the source, the latter is copied to the leading part of result but the remainder is processor dependent. memcpy() would never modify any memory outside the block being copied. Here, the philosophy behind transfer is completely different, as there is no destination argument. The function has to provide as many bytes as the size of the result is, so if the source is shorter, the remaining bytes come out of nowhere. They could be put to zero by the requirement of the Standard, but it is not so.

Again, it is regrettable that equivalence has been declared obsolescent. From the discussion above it seems that modern Fortran does not have a full, nomen omen, equivalent of that statement.

everythingfunctional · November 29, 2022, 3:14pm

Seems like the following solves your problem no?

integer(int64) :: store
integer(int32) :: tmp(2)
integer(int32) :: val_in, val_out

tmp(1) = val_in
store = transfer(tmp, store)
...
tmp = transfer(store, tmp)
val_out = tmp(1)

I mean, you don’t actually know whether the value was stored in the high or low order bits, but it doesn’t matter, it comes back out deterministically.

msz59 · November 29, 2022, 3:30pm

This would not be possible using transfer without going into details of endianess.

FedericoPerini · November 29, 2022, 4:13pm

…in a one liner. But if as little as 4 commands are enough it can be accomplished like:

    elemental subroutine mvbits_64_32(from, frompos, len, to, topos)
        integer(int64), intent(in) :: from
        integer, intent(in) :: frompos,len,topos
        integer(int32), intent(inout) :: to

        integer(int64) :: one_64
        integer(int32) :: two_32(2)

        ! Make 2 copies of the transferred bits, not checking that len<=32!!!!!
        call mvbits(from,frompos,len,one_64,0)
        call mvbits(from,frompos,len,one_64,32)
        two_32 = transfer(one_64,two_32)
        call mvbits(two_32(1),0,len,to,topos)

    end subroutine mvbits_64_32

Test it this way:

program test_mvbits
   use iso_fortran_env
   implicit none

   integer(int64) :: i64
   integer(int32) :: i32

   real :: x,y,z
   integer :: i,len,from,to,bit
   integer, parameter :: NTEST = 1000000


   do i=1,NTEST

       ! Generate a random 64-bit number
       call random_number(x); x = x-0.5
       i64 = nint(x*huge(i64))

       ! Transfer a random number of bits
       call random_number(y)
       len = nint(y*32)

       ! At a random location. Ensure from+len never exceds 64, but do go across the 32rd bit
       call random_number(z)
       from = nint(z*31)

       ! Copy to a random location (shorter)
       call random_number(z)
       to = nint(z*12)

       ! ensure to+from does not exceed 32 bits
       len = min(len,32-to)

       call mvbits_64_32(i64,from,len,i32,to)

       ! perform check
       do bit=1,len
          if (btest(i64,from-1+bit).neqv.btest(i32,to-1+bit)) then
             print '(b64.64)', i64
             print '(b32.32)', i32
             print *, 'from=',from,' len=',len,' to=',to
             stop 'catastrophic error'
          end if
       end do

   end do

   print *, NTEST,' mvbit tests passed!'

    contains

    elemental subroutine mvbits_64_32(from, frompos, len, to, topos)
        integer(int64), intent(in) :: from
        integer, intent(in) :: frompos,len,topos
        integer(int32), intent(inout) :: to

        integer(int64) :: one_64
        integer(int32) :: two_32(2)

        ! Make 2 copies of the transferred bits, not checking that len<=32!!!!!
        call mvbits(from,frompos,len,one_64,0)
        call mvbits(from,frompos,len,one_64,32)
        two_32 = transfer(one_64,two_32)
        call mvbits(two_32(1),0,len,to,topos)

    end subroutine mvbits_64_32


end program test_mvbits

RonShepard · November 29, 2022, 4:34pm

Yes, I agree that those details differ. memcpy() works with pointers to memory that must already be available, whereas transfer can, and sometimes does, allocate its own memory.

Regarding mvbits(), it has the odd feature that the from and to arguments can be the same, or more generally can be associated (i.e. can have overlapping memory). Usually when an argument is modified (as the intent(inout) to argument is), then it cannot be aliased to any other dummy argument. I supposed a copy is made behind the curtain, in order to allow the subroutine to work that way. However, in the specification, it says simply that from is intent(in), it does not mention the value attribute. Of course, a subroutine like this would be expected to work entirely with registers and to not make any memory references until the very end when the to argument has been modified, so that is basically the same as the value attribute anyway.

RonShepard · November 29, 2022, 4:46pm

I had not thought of making two copies. That clever, ugly, hack does work!

FedericoPerini · November 29, 2022, 4:49pm

Clever hacks are actually nice! a small optimization could be that if the number of bits you’re transferring is small (say len<=4), you could revert to a small merge(ibset(..),ibclr(..),btest(..)) loop that’s certainly going to be unrolled.

RonShepard · November 29, 2022, 5:05pm

I think transfer() can replace equivalence in this context of copying bits. The problem with both transfer() and equivalence is that they sit on top of the underlying byte addressing convention. To test for byte ordering, the modern way is something like:

logical, parameter :: little_endian = transfer(1_int64,1_int32) == 1_int32

That is pretty much the same as the 1970s legacy code:

integer*8 i64
integer*4 i32(2)
equivalence(i64,i32)
i64 = 1
little_endian = i32(1) == 1

In the latter case, one would expect the compiler to do most or all of the work at comile time rather than run time, so the practical difference of then and now are minimal. Of course, the modern code is shorter, and arguably esthetically better, but it is still just a way to handle what should be an unnecessary task of accounting for low level details of byte addressing conventions.

msz59 · November 29, 2022, 5:34pm

The problem is that both transfer and now-obsolecent equivalence are based on storage sequence, that is treat objects as a sequence of bytes, somewhat ignoring the endianess. On the other side, mvbits works on integer values (defined as a sequence of bits) and thus is endianess-aware. So it is risky to mix transfer with mvbits, as did @FedericoPerini in his very smart trick. But please note, we call it a trick and I am afraid it might not work on a hypothetical machine with mixed-endianess, as PDP-11 was back in 1970s or 1980s, then downscaled to 2- and 4-byte words.

Allowing mvbits to operate on different kinds/lengths of integers would provide a robust, portable solution to the OP’s problem, eliminating any need to make tricks, smart as they are.

msz59 · November 29, 2022, 5:54pm

I agree that it is possible but requires much more effort. Suppose we need to test the LS bit of the second byte of an 32-bit integer object (second byte in memory!) and if zero, set the second bit of the third byte. Using equivalence it takes one statement (not counting declarations):

integer(int32) :: object
integer(int8) :: obj(4)
equivalence(object, obj)
! [...]
if ( .not. btest(obj(2),0) ) obj(3)=ibset(obj(3),2)

to make it non-obsolescent, you have to add two transfer statements

integer(int32) :: object
integer(int8) :: obj(4)
! [...]
obj = transfer(object, obj)
if ( .not. btest(obj(2),0) ) obj(3)=ibset(obj(3),2)
object = transfer(obj, object)

FortranFan · November 29, 2022, 6:00pm

Why reinvent the wheel when Fortran stdlib has stdlib_bitsets module with bitset_64 type that would allow you to work (read/write) with any subset of the 64 bits?

msz59 · November 29, 2022, 9:26pm

Maybe I have missed something but browsing the specification I could not find any procedure to read/write 64-bit bitset from/to a 64-bit integer, something that OP would probably find interesting. The module seems to provide means to R/W bitsets from/to character literals, formatted/unformatted files and logical arrays (strangely having intNN kinds), but not integers.

FortranFan · November 29, 2022, 10:11pm

See upthread: “should anything be lacking or running into bugs, suitable PR s would be in order to get you the rest of the missing functionality. You may even consider authoring them yourself!”

RonShepard · November 29, 2022, 10:49pm

msz59:

For a “fair” comparison of the equivalence to the transfer(), one would need to look at something like:
integer(int32), intent(inout) :: value
integer(int32) :: object
integer(int8) :: obj(4)
equivalence(object, obj)
! [...]
object = value
if ( .not. btest(obj(2),0) ) obj(3)=ibset(obj(3),2)
value = object

This is because equivalence cannot be done on dummy arguments, so one almost always needs to copy the dummy argument into the local equivalenced structure, do the work, and then copy it back to the dummy argument. Of course this would not be required if working with local or common block entities, which could be equivalenced. In contrast, transfer() does work with dummy arguments, along with local, module, or heaven forbid, common block variables. Once you add the assignments, it looks almost identical to the transfer() code, and in fact I would expect that the compiled instructions would be nearly identical for the two cases.

dwwork · November 30, 2022, 1:42am

In fact, they are identical: Compiler Explorer

At least with gfortran

Topic		Replies	Views
How to write bytes in a binary file? Help	11	2606	March 1, 2021
Unsigned integer data type in modern Fortran	4	2387	May 25, 2021
How to perform sign-extension? Help	17	842	July 15, 2023
Bit manipulation in fortran Help	11	621	March 1, 2024
Initialize 64-bit integer variables with 64-bit real values Help	7	1980	March 11, 2022

Moving Bits Question

Related topics