Proper Way to Represent C `unsigned char`?

It was pointed out here Tsoding on Fortran - #6 by HugoMVale that the popular library raylib (GitHub - raysan5/raylib: A simple and easy-to-use library to enjoy videogames programming) does not have bindings for Fortran. This simply will not do. It was easy enough to get a graphics window to pop up, but I am stuck feeling that this is going to end up being pretty hacky without a clearer idea of how to properly represent a few things that C has and Fortran doesn’t.

First, this structure: https://github.com/raysan5/raylib/blob/master/src/raylib.h#L235

// Color, 4 components, R8G8B8A8 (32bit)
typedef struct Color {
    unsigned char r;        // Color red value
    unsigned char g;        // Color green value
    unsigned char b;        // Color blue value
    unsigned char a;        // Color alpha value
} Color;

At the moment, I am using this derived-type color and accompanying function to set a color based on Fortran default integers:

    type, bind(C) :: color
        integer(c_signed_char) :: r = int(0, c_signed_char) 
        integer(c_signed_char) :: g = int(0, c_signed_char) 
        integer(c_signed_char) :: b = int(0, c_signed_char) 
        integer(c_signed_char) :: a = int(0, c_signed_char) 
    end type color

        pure function dfray_color(r, g, b, a) result(color_out)
            integer, intent(in) :: r, g, b, a
            type(color) :: color_out
            color_out%r = int(r, c_signed_char)
            color_out%g = int(g, c_signed_char)
            color_out%b = int(b, c_signed_char)
            color_out%a = int(a, c_signed_char)
        end function dfray_color

This seems… Non-ideal to say the least. I have not found a nice way to allow a structure constructor to set this up given values 0-255. For whatever reason, compilers are perfectly fine with the function dfray_color, but complain about integer overflow if the same strategy is used in a structure constructor.

I have also considered faking rgba to be a single integer(c_int32_t) and using mvbits, but that can’t be used in the structure constructor either because it is a subroutine.

I would like to be able to use the constructor so that I can define the single derived type and also the following constants (type(color), parameter :: {name} I suppose) for easy use (raylib - cheatsheet)

// Custom raylib color palette for amazing visuals on WHITE background
    #define LIGHTGRAY  (Color){ 200, 200, 200, 255 }   // Light Gray
    #define GRAY       (Color){ 130, 130, 130, 255 }   // Gray
    #define DARKGRAY   (Color){ 80, 80, 80, 255 }      // Dark Gray
    #define YELLOW     (Color){ 253, 249, 0, 255 }     // Yellow
    #define GOLD       (Color){ 255, 203, 0, 255 }     // Gold
    #define ORANGE     (Color){ 255, 161, 0, 255 }     // Orange
    #define PINK       (Color){ 255, 109, 194, 255 }   // Pink
    #define RED        (Color){ 230, 41, 55, 255 }     // Red
    #define MAROON     (Color){ 190, 33, 55, 255 }     // Maroon
    #define GREEN      (Color){ 0, 228, 48, 255 }      // Green
    #define LIME       (Color){ 0, 158, 47, 255 }      // Lime
    #define DARKGREEN  (Color){ 0, 117, 44, 255 }      // Dark Green
    #define SKYBLUE    (Color){ 102, 191, 255, 255 }   // Sky Blue
    #define BLUE       (Color){ 0, 121, 241, 255 }     // Blue
    #define DARKBLUE   (Color){ 0, 82, 172, 255 }      // Dark Blue
    #define PURPLE     (Color){ 200, 122, 255, 255 }   // Purple
    #define VIOLET     (Color){ 135, 60, 190, 255 }    // Violet
    #define DARKPURPLE (Color){ 112, 31, 126, 255 }    // Dark Purple
    #define BEIGE      (Color){ 211, 176, 131, 255 }   // Beige
    #define BROWN      (Color){ 127, 106, 79, 255 }    // Brown
    #define DARKBROWN  (Color){ 76, 63, 47, 255 }      // Dark Brown

    #define WHITE      (Color){ 255, 255, 255, 255 }   // White
    #define BLACK      (Color){ 0, 0, 0, 255 }         // Black
    #define BLANK      (Color){ 0, 0, 0, 0 }           // Blank (Transparent)
    #define MAGENTA    (Color){ 255, 0, 255, 255 }     // Magenta
    #define RAYWHITE   (Color){ 245, 245, 245, 255 }   // My own White (raylib logo)
1 Like

I have written a C library, that is able to take a “longer” integer like int16(so the range between 0 and 255 can be covered) and transforms it to a bit-pattern corresponding to the unsigned char type.

There are, however, some drawbacks

  • really bad performance (i have not put an extreme amount of work into it)
  • assumes integers are two’s complement which is, to my knowledge, not guaranteed by the
    the Fortran standard
  • not extremely well tested

I would also be perfectly fine if you copied just the parts you need and integrate into the code (no credits required, but of course there is no guarantee).

2 Likes

There is also the way, to use the INT intrinsic.

program unsigned_char
  use iso_fortran_env
  implicit none
  integer(int16):: value_16bit = 255_int16
  integer(int8) :: value_8bit
  value_8bit = int(value_16bit, kind=int8)
  print*, value_8bit
end program unsigned_char

Here the result is
-1 which corresponds to the bit pattern of 255 for unsigned char (assuming two’s compliment is used)…

I have tested it with gfortran, ifort and flang for Linux-x64 as well as with gfortran
for Windows.

I do not think that the Fortran standard guarantees it to work, but maybe it
is “good enough” for your purposes (i.e. the set of combinations of architectures,
compilers and operating systems you plan to support).

This solution is much faster than my library, but may lead to “surprising” code
if someone not aware of this behavior has to maintain it.

When I raised the idea of making bindings for raylib, I must admit that somewhere in the back of my mind I was thinking of what @vmagnin has done for gtk and cairo.

As far as I recall, he developped a python wrapper to automate most of the process. Besides the wrapper itself, which might be quite useful to reduce the number of repetitive manual operations, perhaps he has also encountered the same situation (since gkt and cairo are graphical libs).

2 Likes

In gtk-fortran, pixel drawing is made using the GdkPixbuf library. See for example: https://github.com/vmagnin/gtk-fortran/blob/gtk4/examples/mandelbrot_pixbuf.f90

As the pixel buffers are arrays of guchar (an alias for unsigned char), we used the Fortran character(kind=c_char) type.

Typical code is:

character(kind=c_char), dimension(:), pointer :: pixel
integer(int8) :: red, green, blue
...
          red   = int(min(255, k*2),  int8)
          green = int(min(255, k*5),  int8)
          blue  = int(min(255, k*10), int8)
...
        pixel(p)   = char(red)
        pixel(p+1) = char(green)
        pixel(p+2) = char(blue)
1 Like

Probably there will be no perfect solution while there is no unsigned type in the Fortran standard…

The gtk-fortran examples using pixbuffers have always worked correctly in all tested systems: 32 & 64 bits, Windows, Linux, macOS, FreeBSD, with GFortran and Intel compilers. Also on Raspberry Pi (ARM).

1 Like

If it were allowed, the most natural way to do that in fortran would be

call mvbits( value_16bit, 0, 8, value_8bit, 0 )

However, fortran has an artificial restriction that the FROM and TO arguments of the mvbits() intrinsic must be the same kind, so this is not allowed. This restriction has frustrated programmers for decades.

Another approach using standard fortran intrinsics is

integer(int8) :: i8(2)
...
i8 = transfer( value_16bit, i8 )
value_8bit = i8( I8INDX )

The problem here is that I8INDX is machine dependent. It should be 1 for little-endian machines (the majority these days) and 2 for big-endian machines. It may be defined, for example in a module, as

integer, parameter :: I8INDX = merge( 1, 2, 1_int16 == transfer([1_int8,0_int8],int16))

Of course, as noted above, something like integer(int8), unsigned :: i8u would also result in simple code for these types of things when whole bytes are involved. In the old pre-f90 days, this would have been done typically with nonstandard INTEGER*N declarations and EQUIVALENCE, so the language has made some progress in the last 50 years, but one wonders why it is still so difficult to accomplish even simple tasks like this. The mvbits() solution, if it were allowed, works with arbitrary size bit fields and would not depend on the addressing convention of the processor, so it would eliminate any machine-dependence in the fortran code related to these types of operations.

3 Likes

Well… I can’t say I love it. This is the best way I have found to represent raylib’s color struct in Fortran in a way that can both maintain type checking by using a derived-type and be used for constants (or any other initialization) defined with structure constructor using int(hex, c_int32_t).

module raylib_m
use, intrinsic :: iso_c_binding, only: c_int32_t
implicit none
private

    public :: c_int32_t

!   // Color, 4 components, R8G8B8A8 (32bit)
!   typedef struct Color {
!       unsigned char r;        // Color red value
!       unsigned char g;        // Color green value
!       unsigned char b;        // Color blue value
!       unsigned char a;        // Color alpha value
!   } Color;
    type, public, bind(C) :: color
        integer(c_int32_t) :: rgba = 0_c_int32_t
    end type color
    
    type(color), parameter, public :: lightgray = color(int(z'ffc8c8c8', c_int32_t))

end module raylib_m

program main
use, non_intrinsic :: raylib_m
implicit none

    type(color) :: background_color = color(int(z'ff7e1f70', c_int32_t)) ! Dark Purple

    write(*,'(a,z8)') '            from module, lightgray (Light Gray): ',lightgray%rgba
    write(*,'(a,z8)') 'locally defined, background_color (Dark Purple): ',background_color%rgba

end program main

Looking through the cheatsheet I did not see anything that looked to be directly modifying the r, g, b, or a fields of a color, and raylib includes its own ColorToInt to convert the struct to a 32-bit integer.

I will definitely be adding Fortran routines to return a color from 4 Fortran integers, as well as mutate a given color to new value based on the same inputs.

It’s actually kind of crazy that there is any argument against Fortran having unsigned integers. I will have to go review those threads, because it is rather bothersome from a language that supposedly wants such great interoperability with C.

To the point about a python script auto-generating the bindings module - I agree 100% that is an objectively superior solution to hand writing it. Perhaps in the future that will be the direction of this project.

1 Like

This approach also depends on the machine’s byte addressing convention, doesn’t it? On a little-endian machine r will be the low-order bits of rgba, while on a big-endian machine r will be the high-order bits of rgba. This may or may not be important, but it does affect the portability aspects of how the fortran and C codes interact.

That’s fair. Does integer(c_signed_char) :: r = int(255, c_signed_char) depend on anything machine or compiler specific? That was the only other way I could think to do this and keep anything in common/standard with how raylib works/is used. The reason I prefer the single value (for now) is that it allows me to use the structure constructor to make a color with minimal additional work: type(color) :: background_color = color(int(z'ff7e1f70', c_int32_t)), as hex codes like that are fairly common for getting RGB colors as well. Do you have to reverse the bytes of the hex string any online calculator will tell you? Yes, and that is annoying, but alas.

Also, I would be surprised to learn there were any big-endian machines that could also use OpenGL for graphics. I tried a quick Google search, which leads me to PowerPC era Macs. I believe Fortran bindings to raylib will be the least of their worries.

Note the standard states with a conforming Fortran processor and a companion processor, “This has the potentially surprising side effect that the C type unsigned char is interoperable with the type integer with a kind type parameter of C_SIGNED_CHAR”

The issue here is more of the Fortran practitioner making inadvertent mistakes due to the lack of explicit “creature comfort” in Fortran. It may help to work with hexadecimal units in Fortran.

You can test with the processor of choice:

#include <stdio.h>
  
unsigned char c;
void Fsub();

int main()
{
    c = 0xAE;
    printf("In C main: initially c = %x\n", c);
    Fsub();
    printf("Back in C main: c = %u\n", c);
    return 0;
}
module m
   use, intrinsic :: iso_c_binding, only : c_signed_char, c_int32_t
   integer(c_signed_char), bind(C, name="c") :: c
   integer(c_int32_t), parameter :: BLACK = int(z"FF", kind=kind(BLACK))
contains
   subroutine Fsub() bind(C, name="Fsub")
      print *, "In Fsub:"
      print "(*(g0))", "     c: ", c
      print "(g0,z0)", "c(hex): ", c
      c = BLACK
      ! Or the pedantic option 
      !c = int(BLACK, kind=kind(c) )
   end subroutine 
end module 
C:\temp>cl /c /W3 /EHsc c.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.34.31937 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

c.c

C:\temp>ifort /c /free /standard-semantics m.f
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.9.0 Build 20230302_000000
Copyright (C) 1985-2023 Intel Corporation.  All rights reserved.


C:\temp>link c.obj m.obj /subsystem:console /out:c.exe
Microsoft (R) Incremental Linker Version 14.34.31937.0
Copyright (C) Microsoft Corporation.  All rights reserved.


C:\temp>c.exe
In C main: initially c = ae
 In Fsub:
     c: -82
c(hex): AE
Back in C main: c = 255
2 Likes

I’m sorry, I do not understand what you are getting at here. I know that integer(c_signed_char) is the intended interoperable type and kind for C’s unsigned char. The problem is that I cannot say integer(c_signed_char) :: r = 255 without the compiler complaining. As shown here, that is how raylib defines its provided colors.

It is easier to rather define the color struct on the Fortran side as a single 32-bit integer and provide some routines to return or set a color with simple default Fortran integers. For instance pure function make_color(r, g, b, a) result(color) should accept default integers for r, g, b, and a, then return a type(color). Similarly, something like pure subroutine set_color(color, r, g, b, a) would take in a type(color), along with default integer r, g, b, and a, then mutate color to match the inputs. Using a single integer(c_int32_t) :: rgba field also allows nicer use of the structure constructor with hex, as mentioned: type(color) :: red = color(int(z'FF0000FF', c_int32_t)).

That you can ignore / suppress the compiler complaint, it’s simply being overpedantic in this case.

gfortran is very clear about that, and suggests -fno-range-check, but that is going to apply globally, which could be undesirable. At least, that was my understanding.

I think in current fortran, BOZ constants are sometimes treated “as if” transfer() is used to set the bits. So while r=255 overflows, would r=z'FF' or r=b'11111111' work correctly? Or is something more elaborate required, such as r=int(z'FF',int8) or r=int(b'11111111',int8)?

You can’t use BOZ without some wrapper has been my experience:

program main
implicit none

    integer :: r = z'ff'

    write(*,'(a,z8)') 'r: ',r

end program main

Yields the following error with gfortran:

main.f90:4:18:

    4 |     integer :: r = z'ff'
      |                  1
Error: BOZ literal constant at (1) is neither a data-stmt-constant nor an actual argument to INT, REAL, DBLE, or CMPLX intrinsic function [see ‘-fno-allow-invalid-boz’]

Inspired by this thread, I tried to write a proof of concept for a type(uint8) (just to challenge myself). As some here suggested, it uses transfer and therefore only works on little-endian machines. But maybe there is a preprocessor directive or something else to fix this, I don’t know.
Here’s the code:

module uint_m
module uint_m
use iso_fortran_env, only: int8, int32
implicit none

type uint8
    integer(int8) :: i
end type uint8

interface uint8
    module procedure constructor
end interface uint8

interface int
    module procedure uint8_to_int
end interface

interface operator(+)
    module procedure uint8_plus_uint8
end interface

contains

    elemental function constructor(i) result(ui)
        integer(int32), intent(in) :: i
        type(uint8) :: ui
        integer(int8) :: tmp(4)

        tmp = transfer(i, tmp)
        ui%i = tmp(1)  ! only for little-endian
    end function constructor

    elemental function uint8_plus_uint8(lhs, rhs) result(res)
        type(uint8), intent(in) :: lhs
        type(uint8), intent(in) :: rhs
        type(uint8) :: res

        res%i = lhs%i + rhs%i
    end function uint8_plus_uint8

    elemental function uint8_to_int(ui) result(i)
        type(uint8), intent(in) :: ui
        integer(int32) :: i

        i = transfer(ui%i, i)  ! Edit: this line is unsafe, see comment from RonShepard
    end function uint8_to_int

end module uint_m

use uint_m
print *, "127+1", int(uint8(127) + uint8(1))
print *, "64+64", int(uint8(64) + uint8(64))
print *, "127+127", int(uint8(127) + uint8(127))
print *, "128+127", int(uint8(128) + uint8(127))
print *, "255", int(uint8(255))
print "(A, X, Z0)", "255", uint8(255)
end

Maybe it helps. :person_shrugging:

Edit: I forgot to mention, that the bit representation of the uint8 type looks exactly like what you are looking for. I’ve updated the code with a print which demonstrates this.

1 Like

“what happens in the library, stays in the library”…

Because you want to write an interface to a full library, I believe you may want to take inspiration from the C++ interface and do something similar in Fortran.

OK, because of the unsigned char issue, you can’t just use the C type directly like C++ does, but it’s not a big deal. Instead, you can do basically nothing with bind(C) types in Fortran. So, you could just use the bind(C) type to interoperate with the library functions, but work at the Fortran level with a full-featured derived type, with type-bound functions, and only use the C type for setting/getting the information from your Fortran derived type to/from the library:

type, bind(C) :: c_color
   integer(c_signed_char) :: r,g,b,a
end type c_color

type, public :: Color
   integer(int16) :: r = 0, g = 0, b = 0, a = 255
   contains
      procedure :: to_c
      procedure :: from_c
      ! and many more
end type Color

elemental subroutine from_c(this,that)
   class(Color), intent(inout) :: this
   type(c_color), intent(in) :: that ! use _value_ if you want
   ! find a way to convert [-128:127] -> [0:255]
   this%r = int(that%r, kind(this%r))  + 128 
   this%g = int(that%g, kind(this%r)) + 128
   this%b = int(that%b, kind(this%r)) + 128
   this%a = int(that%a, kind(this%r)) + 128
end subroutine from_c

! etc...

I would bet that copyin/copyout with Fortran datatypes would be faster than implementing user-defined operators for a new “unsigned int” type anyways.

There are two potential problems with this function. From the gfortran documentation:

If the bitwise representation of the result is longer than that of SOURCE, then the leading bits of the result correspond to those of SOURCE and any trailing bits are filled arbitrarily.

The first problem is related to the “leading bits”, which in this context are undefined. In practice, they are the low-order bits on little-endian machines and they are the high-order bits on big-endian machines.

Then the second problem in this case is that SOURCE is 8 bits and the result is 32 bits. So the remaining 24 trailing bits (wherever they are) are undefined. This code assumes they are set to zero, but that is not specified by either the standard or the compiler-specific documentation.

Both of these issues have been addressed already in this thread. In this case involving 8-bit and 32-bit integers, you need to specify an array of length 4 in the transfer() step. Something like:

integer, parameter :: I8INDX = merge( 1, 4, 1_int32 == transfer([1_int8,0_int8,0_int8,0_int8],1_int32))
integer(int8) :: i8(4)
i8(:) = 0_int8   ! set all bits to zero.
i8(I8INDX) = ui%i   ! set the leading bits.
i = transfer( i8, i )

should work in a reasonably portable way. However, even this is not strictly guaranteed to work because in principle there could be machines where I8INDX needs to be 2 or 3. The bulletproof code would require all four possibilities to be tested in the parameter expression, not just one of them as above, but that code would be unclear to a human reader, so some compromise might be appropriate.

This mapping of integer values would not match the C values. In twos-complement arithmetic, only the negative values should be shifted (by 256, not 128), the nonnegative values should remain the same. This detail would complicate the code significantly.

1 Like