Which is the more standard compliant way of writing bytes (octets) in a binary file?

In C there is the unsigned char type that can be used to store bytes. But in Fortran integers are signed. If I use an integer with kind INT8, it should be OK for values between 0 and 127. But if I need to write values in the 128…255 range, problems begin… Moreover, the representation of negative values is processor dependent: most of the time two's complement is used, but without any guarantee.
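One common workaround (conforming, but relying on two's complement for the stored bit pattern) maps the 128…255 values onto the negative INT8 values that share the same octet. A minimal sketch, names illustrative:

```fortran
program wrap_demo
    use, intrinsic :: iso_fortran_env, only: int8
    implicit none
    integer :: v
    integer(int8) :: b

    v = 200
    ! 200 - 256 = -56, whose two's-complement bit pattern is 11001000,
    ! i.e. exactly the octet 0xC8 = 200.
    b = int(merge(v - 256, v, v > 127), int8)
    print *, b
end program wrap_demo
```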

I have the idea of using MVBITS(): I put the 0…255 value in an INT16, then transfer the last 8 bits into an INT8. Assuming of course the correct endianness… https://gcc.gnu.org/onlinedocs/gfortran/MVBITS.html
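A small sketch of that idea. One caveat: the standard requires the FROM and TO arguments of MVBITS to have the same type and kind, so the move below stays within INT16; getting the pattern into an INT8 would then need INT or TRANSFER:

```fortran
program mvbits_demo
    use, intrinsic :: iso_fortran_env, only: int16
    implicit none
    integer(int16) :: wide, low

    wide = 255_int16      ! bit pattern 00000000 11111111
    low  = 0_int16
    ! Copy the 8 least significant bits of wide into low.
    call mvbits(wide, 0, 8, low, 0)
    print '(B16.16)', low
end program mvbits_demo
```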

A bit of overthinking here. In memory, an INT8 integer is just a block of 8 bits. Unformatted stream I/O should treat them that way. The signed/unsigned distinction only matters when you perform arithmetic on the values, or convert them to a different KIND of integer.
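For instance, unformatted stream access writes the INT8 storage units verbatim (the file name demo.bin is arbitrary):

```fortran
program stream_bytes
    use, intrinsic :: iso_fortran_env, only: int8
    implicit none
    integer(int8) :: bytes(4)
    integer :: u, fsize

    ! Octets 0x00, 0x7F, 0x80, 0xFF; the last two print as negative
    ! integers on a two's-complement processor, but the bit patterns
    ! written to the file are exactly those four octets.
    bytes = [ 0_int8, 127_int8, -128_int8, -1_int8 ]

    open (newunit=u, file='demo.bin', access='stream', &
          form='unformatted', status='replace')
    write (u) bytes
    close (u)

    inquire (file='demo.bin', size=fsize)
    print *, fsize    ! 4 file storage units, i.e. 4 bytes
end program stream_bytes
```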

@interkosmos, @alozada
The TRANSFER proposition is interesting: “Transfer physical representation.” I will give it a try: the representation of 255 in an INT16 should be 00000000 11111111, or rather 11111111 00000000 on a little-endian machine… Will I get 00000000 or 11111111 in my c_int8_t?
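A quick experiment along those lines (which octet lands first in the result depends on the machine's endianness):

```fortran
program transfer_demo
    use, intrinsic :: iso_c_binding, only: c_int16_t, c_int8_t
    implicit none
    integer(c_int16_t) :: v
    integer(c_int8_t)  :: bytes(2)

    v = 255_c_int16_t
    ! Reinterpret the physical representation of v as two octets.
    bytes = transfer(v, bytes)
    ! Little endian: bytes(1) = 11111111, bytes(2) = 00000000.
    ! Big endian: the other way round.
    print '(B8.8, 1X, B8.8)', bytes(1), bytes(2)
end program transfer_demo
```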

The character solution could work on our machines, but I am not sure it is universal. The standard says:

“The processor defines a collating sequence for the character set of each kind of character. The collating sequence is an isomorphism between the character set and the set of integers {I : 0 ≤ I < N}, where N is the number of characters in the set.”

Yes, 0x81 = 129 is purposely chosen just out of the INT8 range [-128; +127].
Concerning INT(A, KIND) the Fortran 2018 standard says:

Case (iv): If A is a boz-literal-constant, the value of the result is the value whose bit sequence according to the model in 16.3 is the same as that of A as modified by padding or truncation according to 16.3.3. The interpretation of a bit sequence whose most significant bit is 1 is processor dependent.

and:

16.3.3 Bit sequences as arguments to INT and REAL
1 When a boz-literal-constant is the argument A of the intrinsic function INT or REAL,
• if the length of the sequence of bits specified by A is less than the size in bits of a scalar variable of the same type and kind type parameter as the result, the boz-literal-constant is treated as if it were extended to a length equal to the size in bits of the result by padding on the left with zero bits, and
• if the length of the sequence of bits specified by A is greater than the size in bits of a scalar variable of the same type and kind type parameter as the result, the boz-literal-constant is treated as if it were truncated from the left to a length equal to the size in bits of the result.

I think that A=129 is a positive 16-bit signed integer: 00000000 10000001, and with INT(A, KIND=INT8) it is truncated to the rightmost 8 bits: 10000001. And that’s what I want: I just want to write the octet 10000001 to the file; I don’t care that it can be interpreted and printed as -127 by my processor.
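That reading can be checked with a BOZ constant, which is exactly what Case (iv) and 16.3.3 cover (the printed interpretation below assumes a two's-complement processor):

```fortran
program boz_demo
    use, intrinsic :: iso_fortran_env, only: int8, int16
    implicit none
    integer(int8) :: b

    b = int(z'81', int8)   ! 8-bit pattern 10000001, no padding needed
    ! The interpretation of that pattern is processor dependent; on a
    ! two's-complement machine it prints as -127.  Widening and masking
    ! the low 8 bits recovers the unsigned value 129.
    print *, b, iand(int(b, int16), 255_int16)
end program boz_demo
```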

The 16.3 Bit model is the classical binary model used for positive values. (But “the interpretation of a negative integer as a sequence of bits is processor dependent”.)

achar() is defined as result = achar(i [, kind]) (with an optional character kind for the result since Fortran 2003). The argument i should be a positive integer in [0 … 255]. The type integer(kind=1) has the range -127 … 127. You can pass an integer(kind=1) to achar(), but negative values will be converted to positive (I don’t know whether this behaviour is undefined in the language standard).
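As an illustration of the character route (assuming, as on common processors, a one-byte default character kind whose code CHAR(i) is the octet i for 0…255; the standard only fixes ACHAR for the ASCII codes 0…127):

```fortran
program char_demo
    implicit none
    character(len=1) :: c

    ! Position 200 in the processor's default collating sequence;
    ! on common processors this is the octet 0xC8.
    c = char(200)
    print *, ichar(c)      ! expected to round-trip to 200
end program char_demo
```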