Moving Bits Question

I am rewriting some old codes from the 1970s and 1980s that pack/unpack and move bits around. Back then, none of the bit operators that we used were standard, so we have typically 5-10 #if blocks of conditional code to handle all of the compiler-specific versions. We sometimes use EQUIVALENCE with LOGICAL*1 bytes and INTEGER*2, INTEGER*4 and INTEGER*8, and also REAL*8 for machines that did not support INTEGER*8. Fortran 77 and earlier was very limited in this respect, so we had to work to make things as consistent, clear, portable, and efficient as possible.

However, since f90, the usual bit operations have been standardized, and since f2008 we now have the INT8, INT16, INT32, and INT64 integer kinds (with INT64 being required, I think), so now we have more to work with than we did 40 years ago. We still do not have an intrinsic bit string type, which would make all of this simple and portable, but we do have more tools in the standard than we used to. So one required operation is to pack 32-bit integer values into an array of 64-bit integer containers. My first thought is something like

   integer(int32) :: i32
   integer(int64) :: i64
   ...
   call mvbits( i32, 0, 32, i64, ipos )
   ...
   call mvbits( i64, ipos, 32, i32, 0 )

where ipos == 0 or 32 as appropriate for the i64 destination. That syntax seems simple and clear and it avoids all the complications of the sign bit – the bits should just be copied with no interpretation or scaling or sign extending or anything. But of course for those who do this all the time, you immediately recognize the problem that the mvbits() intrinsic requires the from and to arguments to be of the same KIND. Ok, so then I think

call mvbits( int(i32,int64), 0, 32, i64, ipos )

should do the trick. That works for positive i32 values, but that doesn’t work when the sign bit is set. So then I think

call mvbits( transfer(i32,i64), 0, 32, i64, ipos )

should work. On my computer, the transfer() leaves the most significant 32 bits in the from argument unspecified, but that doesn’t matter because mvbits() is not referencing them.

This actually works on my computer, so maybe I should just stop while I’m ahead. But here is the problem. The transfer() operation just moves the bits, it doesn’t really define where those bits get moved, in contrast to the IOR(), IAND(), SHIFTL(), and so on bit operators. The programmer isn’t really supposed to use the results directly from transfer(), it is just supposed to hold the bits for a later transfer() operation that reverses the effect. So specifically in this case, the 32 bits that I’m working with might conceivably end up in either the low-order bits or the high-order bits of the transfer() result. So now I’m back to writing machine-dependent code like

call mvbits( transfer(i32,int64), jpos, 32, i64, ipos )

where jpos==0 when transfer moves the bits to the low-order part of the result and jpos==32 when the bits are in the high-order part of the result. Since this whole exercise is to eliminate machine-dependent code, that uncertainty kind of defeats the purpose. BTW, this low/high problem was there in the old 1970s code too when we used equivalence to move the bits around.

integer*4 i32(2)
integer*8 i64
equivalence (i32,i64)
 ...
i32(jpos) = bits_to_be_placed

If I want the bits placed in the low-order bits of i64, then on little-endian machines, jpos should be 1, while on big-endian machines, jpos should be 2. In our legacy code, this was sometimes done by testing a runtime expression, and it was sometimes done with #if conditional compilation. Either way, it is an ugly mess.

Ok, so maybe I should just avoid mvbits(). I could go back to shifting and masking the bits with SHIFTL, SHIFTR, IOR, etc. instead. For the 32-bit field width case

i64 = 0_int64
   ...
i64 = IOR(i32,i64)

That should put those bits in the right place, with no complications due to signed integers or anything like that. But again, there is the restriction that the two arguments in IOR should be of the same kind (understandable in this case), so this does not work. So I could go through the same steps as above with

i64 = IOR(int(i32,int64),i64)

or

i64 = IOR(transfer(i32,i64),i64)

to get the kinds to match, but now I’m back to the sign problem in the first case and to the machine-dependent transfer() problems in the second case. On my computer, the transfer operation actually places the bits in the right place, but it seems to leave random noise in the high-order bits, so something like

i64 = IOR(ibits(transfer(i32,i64),0,32),i64)

or the equivalent masking operatiion with IAND() is actually required. That seems way too complicated for the simple operation that I want to do, and of course in the end, it is still machine-dependent exactly where transfer() puts those bits within the result.

So am I overlooking something here. Is there a simple way to place 32 bits from i32 into the low-order bits of i64? If not, than all of this could be avoided if the original expression that I tried

call mvbits( i32, 0, 32, i64, ipos )

with ipos==0 would have worked. That syntax specifies clearly and unambiguously where the bits are coming from and where they are going to. Why oh why did the standard committee screw us over by requiring the from and to arguments to have the same KIND in mvbits()? And what about a bit string data type, which would have solved all of this once and for all 40 years ago.

This issue also gets back to the “eating your own dog food” issue that came up in a different thread recently. If compiler writers using fortran had encountered all of these complications 40 years ago, then they would have immediately changed the language to eliminate the problem. All this would have never been an issue in the language.

1 Like

There is also the reverse problem. Suppose you have somehow managed to get the bits packed into the low-order part of i64, then how do you get them back out into i32?

i32 = ibits(i64,0,32)

works alright when bit# 31 is not set, but what happens when it is set? If the compiler just ignores the integer overflow in the assignment, then all is good, but if it detects the overflow then it could throw an exception and abort your job. Something like

i32 = transfer(i64,i32)

or

i32 = transfer( ibits(i64,0,32), i32 )

might work, but this again gets back to the ambiguity of exactly how transfer places bits. It seems like the best solution would be

call mvbits( i64, 0, 32, i32, 0 )

because that specifies exactly where the bits are coming from and going to. But mvbits() has the KIND restriction that prevents this.

Try out the bitset type in the Fortran stdlib effort:

William Clodius, a long-time computational scientist and also a contributor to many a discussion over time at comp.lang.fortran, is the primary author of the above bitsets module in stdlib. That will hopefully serve as a positive reference for you.

And I think this work on stdlib bitsets and the methods therein can meet most of your needs; should anything be lacking or running into bugs, suitable PRs would be in order to get you the rest of the missing functionality. You may even consider authoring them yourself!

3 Likes

If you want to avoid dealing with endianness, maybe you could do:

! copy an int32 to the first 32 bits of an int64
elemental integer(int64) function bit32_to_64(i32) result(i64)
   integer(int32), intent(in) :: i32

   integer :: bit
  
   i64 = 0 
   do bit=0,31
       i64 = merge(ibset(i64,bit),ibclr(i64,bit),btest(i32,bit))
   end do
end function 

and the other way around. Not-so-ugly and hopefully conceptually clear?

1 Like

I had also thought of

i64 = 0_int64
do j = 0, 31
       if( btest(i32,j) i64 = ibset(i64,j)
enddo

That is standard conforming, but it seems inefficient and inelegant. Is a loop over the individual bits really the best that fortran has to offer?

I was hoping to avoid all endianess issues by adopting a standard convention and using the standard fortran bit operators.

The transfer() operations work on my machine, but when I read the standard it seems like that is just by coincidence. The standard refers to “leading parts” of the source and destimation arguments, but I don’t see anywhere where the leading parts of integers are defined. Thus I suspect that it processor dependent, and is the same old endianess issue just hidden by a layer of abstraction.

I have individual integers. I want to store them in a bit bucket (an int64 array). And then I want to be able to retrieve or modify an individual value from that bit bucket. If I could control where the bits go into each int64 element (or elements, when they cross boundaries), then I could avoid any endian issues.

If mvbits() were a little more flexible, then that would solve the problem. But as I found out, it isn’t. As it is now defined, it just sits one level above the problem.

If the “leading parts” of integers were defined, then I could use transfer() to solve the problem. But I don’t think that is the case, so transfer() also sits one level above the problem.

My general problem is to work with strings of bits of arbitrary lengths. A bit string data type would also solve all of these issues, but that doesn’t exist in the standard. So I’m doing things with int64 arrays the hard way. If I store two 10-bit strings in adjacent locations, then I want to be able to retrieve them back as one 20-bit string in a consistent way. All of this endianess ambiguity gets in the way of doing that in a portable way. I suspect the code I have working now on my little-endian machine will not work on a big endian machine.

@RonShepard and any other readers interested in something like this,

Please consider working with stdlib and introducing some general facilities in there that all the Fortran practitioners can make use of.

Also, avoid copying data as much as possible in such a facility.

In addition to the bitsets module I suggested upthread, if something additional is required for a specific need, here is another custom type to consider as merely a crude starting point. Based on a “NY minute” worth of thought, methinks the fields in such a “bucket” should be the length of a colloquial “byte” i.e., 8 bits. Further insight might change this thinking. Anyways, here it is: how is it for the “best” Fortran has to offer, does this pass muster?

! File p.f90
module bitbucket_m
   
   use, intrinsic :: iso_c_binding, only : I1 => c_int8_t, I4 => c_int32_t, I8 => c_int64_t, c_loc, &
      c_f_pointer
      
   private

   type, public :: bitbucket_t
   ! Field of 8-bit integers
      private
      integer(I1) :: m_fields(8) = 0_i1
   contains
      private
      procedure, pass(this) :: zap_fields
      procedure, pass(this), public :: int32 => get_int32
      procedure, pass(this), public :: int64 => get_int64
      generic, public :: zap => zap_fields
   end type bitbucket_t

   type :: pos_t
      integer :: pos = 0
   end type
   type(pos_t), public, parameter :: POS0 = pos_t() 
   type(pos_t), public, parameter :: POS32 = pos_t( pos=32 ) 
   
contains

   subroutine zap_fields( this, idx )
      ! Argument list
      class(bitbucket_t), intent(inout) :: this
      integer, intent(in)               :: idx(:)
      ! Local variable
      this%m_fields( idx ) = 0_i1
      return
   end subroutine zap_fields
   
   function get_int32( this, pos ) result(pvals)
      ! Argument list
      class(bitbucket_t), intent(in), target :: this
      type(pos_t), intent(in)                :: pos
      ! Function result
      integer(I4), pointer :: pvals
      pvals => null()
      select case ( pos%pos )
         case ( 0 ) 
            call c_f_pointer( cptr=c_loc(this%m_fields(1)), fptr=pvals )
         case ( 32 ) 
            call c_f_pointer( cptr=c_loc(this%m_fields(3)), fptr=pvals )
      end select
      return
   end function get_int32
   
   function get_int64( this ) result(pval)
      ! Argument list
      class(bitbucket_t), intent(in), target :: this
      ! Function result
      integer(I8), pointer :: pval
      call c_f_pointer( cptr=c_loc(this%m_fields), fptr=pval )
      return
   end function get_int64
   
end module
   use, intrinsic :: iso_fortran_env, only : int32, int64
   block 
      integer(int32) :: i32
      integer(int64) :: i64
      print *, "Block 1: using copy assignment with a type conversion function"
      i32 = -123456
      i64 = bit32_to_64( i32 )
      print "(*(g0))", "i32  = ", i32
      print "(*(g0))", "i64  = ", i64
      print "(g0,b0)", "i32 (bit pattern): ", i32
      print "(g0,b64)","i64 (bit pattern): ", i64
   end block
   print *
   block 
      use bitbucket_m, only : bitbucket_t, POS0
      type(bitbucket_t) :: bucket
      print *, "Block 2: using a custom type"
      bucket%int32( POS0 ) = -123456
      print "(*(g0))", "bucket%int32( POS0 )  = ", bucket%int32( POS0 )
      print "(*(g0))", "bucket%int64()  = ", bucket%int64()
      print "(g0,b0)", "bucket%int32( POS0 ) (bit pattern): ", bucket%int32( POS0 )
      print "(g0,b64)","bucket%int64() (bit pattern): ", bucket%int64()
   end block
contains
   ! copy an int32 to the first 32 bits of an int64
   elemental integer(int64) function bit32_to_64(i32) result(i64)
      integer(int32), intent(in) :: i32

      integer :: bit
     
      i64 = 0 
      do bit=0,31
          i64 = merge(ibset(i64,bit),ibclr(i64,bit),btest(i32,bit))
      end do
   end function
end
C:\temp>gfortran -Wall p.f90 -o p.exe

C:\temp>p.exe
 Block 1: using copy assignment with a type conversion function
i32  = -123456
i64  = 4294843840
i32 (bit pattern): 11111111111111100001110111000000
i64 (bit pattern):                                 11111111111111100001110111000000

 Block 2: using a custom type
bucket%int32( POS0 )  = -123456
bucket%int64()  = 4294843840
bucket%int32( POS0 ) (bit pattern): 11111111111111100001110111000000
bucket%int64() (bit pattern):                                 11111111111111100001110111000000

C:\temp>
1 Like

I agree with your observations, a derived type would be spot on if Fortran allowed an access operator like discussed here, that apparently not many care about…

1 Like

I notice in the bit32_to_64() function that there is a loop over the individual bits. Is that really the best that can be done with fortran? I would expect that to be a factor of 50 to 100 times slower than a solution that copies all the bits at once in a single instruction.

Regarding whether I should be using int8, int32, or int64 arrays to store the bits, that is partly determined by the legacy api. We have historically used either real*8 or integer*8 arrays to store the bits, depending on what was supported by the compilers of that time. We routinely used type punnng through dummy argument lists to switch back and forth as necessary. Some machines, such as CRAY, only supported 64-bit words, while others required address alignment of real*8 variables and arrays, so using real*8 or integer*8 arrays to store the bits solved all those problems regarding portability.

Now with modern fortran, I’m rewriting some of this low-level code to use just standard declarations and standard intrinsic operators.

The same post referenced with the quoted comment above also shows one other option that does not involve a loop, why are you not taking a look at that?

I only included the bit32_to_64() function, since it was mentioned upthread, for side-by-side comparison with the other option.

Yes, I did notice that. I’m already doing something similar with c_loc() and c_f_pointer() to simulate the legacy type punning, so I might consider that approach for the bit moving problem too. However, with this approach, there is still the big/little-endian addressing problem that arises when changing KINDs, right?

Well, if you have a big-endian machine, try out the code and adapt it as needed. At this point, it ain’t a discourse, damn it - it’s just do it!

Here, going by IBM’s write-up with

const int i = 1;
#define is_bigendian() ( (∗(char∗)&i) == 0 )

the following should work in Fortran:

   logical, parameter :: IS_BIG_ENDIAN = iachar( c=transfer(source=1,mold="a") ) == 0

So then the above bitbucket approach can likely be as follows instead:

! File p.f90
module bitbucket_m
   
   use, intrinsic :: iso_c_binding, only : I1 => c_int8_t, I4 => c_int32_t, I8 => c_int64_t, c_loc, &
      c_f_pointer
      
   private

   type, public :: bitbucket_t
   ! Field of 8-bit integers
      private
      integer(I1) :: m_fields(8) = 0_i1
   contains
      private
      procedure, pass(this) :: zap_fields
      procedure, pass(this), public :: int32 => get_int32
      procedure, pass(this), public :: int64 => get_int64
      generic, public :: zap => zap_fields
   end type bitbucket_t

   ! Ref: https://developer.ibm.com/articles/au-endianc/
   logical, parameter :: IS_BIG_ENDIAN = iachar( c=transfer(source=1,mold="a") ) == 0
   integer, parameter :: POS = merge( 3, 1, IS_BIG_ENDIAN )

contains

   subroutine zap_fields( this, idx )
      ! Argument list
      class(bitbucket_t), intent(inout) :: this
      integer, intent(in)               :: idx(:)
      ! Local variable
      if ( size(idx) > size(this%m_fields) ) error stop 
      if ( any(idx > size(this%m_fields)) ) error stop 
      this%m_fields( idx ) = 0_i1
      return
   end subroutine zap_fields
   
   function get_int32( this ) result(pval)
      ! Argument list
      class(bitbucket_t), intent(in), target :: this
      ! Function result
      integer(I4), pointer :: pval
      call c_f_pointer( cptr=c_loc(this%m_fields(POS)), fptr=pval )
      return
   end function get_int32
   
   function get_int64( this ) result(pval)
      ! Argument list
      class(bitbucket_t), intent(in), target :: this
      ! Function result
      integer(I8), pointer :: pval
      call c_f_pointer( cptr=c_loc(this%m_fields), fptr=pval )
      return
   end function get_int64
   
end module
   use, intrinsic :: iso_fortran_env, only : int32, int64
   block 
      integer(int32) :: i32
      integer(int64) :: i64
      print *, "Block 1: using copy assignment with a type conversion function"
      i32 = -123456
      i64 = bit32_to_64( i32 )
      print "(*(g0))", "i32  = ", i32
      print "(*(g0))", "i64  = ", i64
      print "(g0,32x,b0)", "i32 (bit pattern): ", i32
      print "(g0,b64)","i64 (bit pattern): ", i64
   end block
   print *
   block 
      use bitbucket_m, only : bitbucket_t
      type(bitbucket_t) :: bucket
      print *, "Block 2: using a custom type"
      bucket%int32() = -123456
      print "(*(g0))", "bucket%int32()  = ", bucket%int32()
      print "(*(g0))", "bucket%int64()  = ", bucket%int64()
      print "(g0,32x,b0)", "bucket%int32() (bit pattern): ", bucket%int32()
      print "(g0,b64)","bucket%int64() (bit pattern): ", bucket%int64()
   end block
contains
   ! copy an int32 to the first 32 bits of an int64
   elemental integer(int64) function bit32_to_64(i32) result(i64)
      integer(int32), intent(in) :: i32

      integer :: bit
     
      i64 = 0 
      do bit=0,31
          i64 = merge(ibset(i64,bit),ibclr(i64,bit),btest(i32,bit))
      end do
   end function
end

On a little-endian processor, the above yields:

C:\temp>p.exe
 Block 1: using copy assignment with a type conversion function
i32  = -123456
i64  = 4294843840
i32 (bit pattern):                                 11111111111111100001110111000000
i64 (bit pattern):                                 11111111111111100001110111000000

 Block 2: using a custom type
bucket%int32()  = -123456
bucket%int64()  = 4294843840
bucket%int32() (bit pattern):                                 11111111111111100001110111000000
bucket%int64() (bit pattern):                                 11111111111111100001110111000000

C:\temp>
1 Like

Re: “The solution proposed upstream with Fortran OOP feature is overkill to say the least” -

Not in the least. The solution proposal using the custom type (say bitbucket_t) once tailored to the needs of OP and used properly will help as follows:

  • no need for unpacking and the consumer can avoid a duplicate copy instruction,
  • is all in Fortran,
  • no direct need for bit processing
  • can be scaled to yield significantly greater performance if a consumer need to work with large arrays of integers.

Basically the code can be as simple as:

   use bitbucket_m, only : bitbucket_t
   ..
   type(bitbucket_t) :: bucket 
   ..
   bucket%hi() = int(z'01020304', int32)  ! store the source integer
   bucket%lo() = int(z'05060708', int32)  ! store the source integer
   ..
   ! consumer the high and low pairs
   write(*,'(2Z8.8)') bucket%hi(), bucket%lo()

So the point here about a custom type and tailoring it to the needs is that the solution proposal can easily be adapted to be generic. Then if OP wants to store n (likely n=2) X-bit (likely X=32, but possibly other kinds also) integers in a bucket of n*X bits, it can be designed for that. There won’t be any waste of storage, what was shown upthread with storing only 32-bir integer in a 64-bit bucket was just a quick illustration that was only focused on a specific aspect of OP’s question in the original post and that was even though the questions in the first two posts by OP with MVBITS didn’t seem all that relevant with the other possible options with packing

Parameterized derived type (PDT) facility also can yield further convenience here toward the custom type (say bitbucket_t), too bad gfortran doesn’t support it whereas Intel Fortran and NAG do.

The big advantage of the custom type can be that none of the processor-dependent aspects with sign bit, etc. need apply, the solution can yield internal consistency and the same code should work with different OS and compilers. What one places in a bucket is exactly what one takes out, that is the basic idea.

In fact, if one considers a broader, generic need with different KINDs of integers as alluded by OP in a subsequent post, the suggestion to use C interop with much code in C using bit manipulation is indeed what is an overkill.

Duh!, bit32_to_64 is not part of the custom type approach with bitbucket_t suggested upthread, the one being termed “Fortran OOP solution proposal”.

bit32_to_64 was mentioned upthread earlier and a variant was indicated by @RonShepard with a question along the lines of whether that was the best Fortran had to offer! So after the Thanksgiving festivities and a heavy stomach while family got going with social media and TV and shopping plans, I spent a few minutes thinking of alternative options using Fortran and came up with the bitbucket_t option rather quickly.

And I only included the bit32_to_64 function in my post as a side-by-side comparison of the results and also to illustrate to those who can read that hey, such bit fiddling can have a performance impact whereas the other one that I showed with custom type and I listed under Block 2 in the example calling program will hardly take any extra cycles and yet give the same results.

There is general bias in certain circles that anything with OO is an overkill and that’s the only basis here too

With a generic need that any actual practitioner may have in mind in terms of the packing of integers of different KINDs in a “container” with convenient storage and retrieval at little cost, the option I show above with bitbucket_t or some such variant thereof is a better way to go. The C interop with bit fiddling with “<<” and union of unsigned and signed types of fixed sizes are all a total overkill.

Why would it not work for negative values? int(i32,kind=int64) replicates the MSBit of i32 as a sign bit but this only affect the upper half of i64, so when using frompos=0 and size=32 in mvbits, only the original, unaffected bits are copied.

program m
  use iso_fortran_env, only : i4=>int32, i8=>int64
  implicit none
  integer(i4) :: i32
  integer(i8) :: i64, i64a

  i32=int(z'ff')
  i64=0
  call mvbits(int(i32,kind=i8),0,32,i64,0)
  print *, i64
  i32=-1
  i64=0
  call mvbits(int(i32,kind=i8),0,32,i64,0)
  print *, i64
end program m
!                    255     = 0x00000000000000FF
!             4294967295     = 0x00000000FFFFFFFF

I would use equivalence anyway but I understand that you want to avoid obsolescent features. That’s how the progress looks like :slight_smile:

The left-hand side pointer expression does make this code a bit surprising. It took me a while to realize what was happening. I bet I’m not the only one to find an expression on the left-hand side surprising. The fact you need to use C interoperability functions c_loc and c_f_pointer to subvert the type system is also not the “cleanest” example of Fortran if you ask me. In some way the packing and unpacking in C captures the intent better. In general I agree about using an abstraction, especially if the bucket needs to be larger than what a scalar integer type can offer.

At this link you can take a look at both solutions and their assembly: Compiler Explorer (the %hi() and %lo() routines I added assume little endian).

To me, the c_loc and c_f_pointer approach is about the same level of clarity as my old pre-f90 versions of the code that used type punning and equivalence. And like those old approaches, it still all sits above the big/little-endian mess, it doesn’t really avoid it.

I think it is correct to say that there is no straightforward approach in either fortran or C that avoids all of this mess. If mvbits() where generalized a little so that

mvbits( i32, 0, 32, i64, 0 )

would work, then I think that would be the clearest and simplest solution to all of this. That says explicitly where the bits are coming from, with no endian ambiguity, and where they are going to, again with no endian ambiguity, and there is no unnecessary explicit conversions with either int() or transfer() that the programmer must do. Of course, behind the curtain, the compiler might well do something like that to perform the operation, but that would be fine as long as it eliminates the endian ambiguity and any signed integer complications.

When you read the description of mvbits() in the standard, it goes to some length to say that the pos+len arguments must fit within the number of bits of the integer kinds of the source and destination arguments. That of course is a reasonable requirement. But then it also requires that the integer kinds of the two arguments must be the same, which seems like an unnecessary afterthought. Is there any chance to change the standard to remove that seemingly unnecessary restriction to the mvbits() intrinsic?

2 Likes

The notion of type subversion is itself quite unclean in languages such as Fortran and others like it (C would fit here) that generally strive rather hard for type-kind (to borrow a Fortran standard term) compatibility in the semantics. Yet that is what OP seeks. In the context then to bring any consideration of “cleanliness” in a solution for the task is without much merit.

What I suggest is within the spirit of the standard and it should be seen as a better option than looping over the bits and likely stumbling over the endian-ness. c_loc is specified in the standard to yield the “C address” of an object in Fortran, the only requirement is that it have the TARGET or the POINTER attribute, the object is not required to be interoperable. And c_f_pointer is specified to take a “C address” and complete a “pointer assignment” with a pointer object. Taking advantage of the facility in Fortran is only par for the course, hence my example upthread.

Not sure why the “surprising” aspect should be any consideration. It has been over a dozen years since the standard introduced such a facility precisely for applications like these i.e., indexer-based access to “data”. Practitioners should now consider getting on with it.