Binding C union in Fortran

Hi,

I have some C structures meant for interop with Fortran that contains unions (tagged ones). At the moment, as we are on Intel compiler we have access to the UNION extension but it is not standard and is only recognised by tools that implement them.
I saw there was equivalence and transfer but it says this is deprecated, very surprisingly.

Am I left with this compiler extension or is there something else that could work to allow the bindings?

Thanks
Regards

1 Like

Something which I imagine would work is

  1. Hide the whole struct and expose accessor functions that act on a pointer

Unions fall outside of the interoperable entities, so the classic approach to solve this is using indirection with an opaque pointer.


Neither are deprecated, however equivalence has been declared “obsolescent”. AFAICR, equivalence is limited to intrinsic numeric and character types, and sequence types, but not user-defined derived types.

I guess transfer could work, as long as you know the width of the union type. But it would still be a loophole:

The loophole lets the programmer breach the type-checking by the compiler. It is a way to say: “Don’t interfere, as I am smarter than the rules”.

Also, welcome to Fortran Discourse!

2 Likes

WG5 “conditionally accepted” a form of unions as proposal DIN02 for Fortran 202Y (see j3-fortran.org/doc/year/24/24-117.txt) I see that J3 was supposed to discuss this at the October 2024 meeting, but I don’t think we did. I’ll bring it up again.

2 Likes

J3 did discuss it this week and voted to recommend to WG5 that unions be moved to the Approved list, A WG5 letter ballot on this will be going out soon. Some members were uncomfortable with introducing a new form of equivalence association to the language, given that EQUIVALENCE is obsolescent, but the argument that lacking unions in C interoperability was a major hole won the day.

2 Likes

If it is such a major hole why wasn’t it a work-item in F2003? I think unions don’t appear very often as dummy arguments in C functions.

Some notable members of the C++ committee have several guidelines against the use of unions:

In the rare cases when unions are warranted there are guidelines for their safe use: C++ Core Guidelines

In C++17 they introduced std::variant to overcome the type-safety issue.


Also Jens Gustedt, member of the C committee, co-editor of the C standard and author of textbook of Modern C (covering C23) says the following:

Unions: These overlay items of different base types in the same memory location. Unions require a deeper understanding of C’s memory model and are not of much use in a programmer’s everyday life. (emphasis added)

I find it interesting that J3 thinks interoperability of unions is warranted.

1 Like

I guess the question becomes why not make the union also a regular Fortran construct, why does it have to be bind(c) only?

I am guessing the answer is that we don’t actually want to use union in Fortran, supporting @ivanpribec’s post and quotes. But we still want to interoperate with C.

Note: most compilers (including LFortran) already have support for unions internally. So actually exposing it to the front-end language is not difficult.

1 Like

Thanks for all the answers and input!

I come from the C/C++ world, so in my opinion, tagged unions (and their std::variant C++ counterpart as mentioned) are a very powerful and basic construct on how you model your data.

How would one do it at the moment in standard Fortran? If you have two structures (trivial or derived types) that are mutually exclusive (you cannot have both existing at the same time), would you have to store both aside one another and “waste” the extra space?

If you really need to save memory, you can always make mutual exclusive attributes allocatable.

You could use class(*):

type :: t1
end type

type :: t2
end type

type :: t3
    integer :: i = 0
    real :: r = 0
    class(*), allocatable :: met
end type

type(t3) :: a, b, c, d

print*,'checking a:'
call check(a)

allocate (b%met, MOLD = t1())
print*,'checking b:'
call check(b)

allocate (c%met, MOLD = t2())
print*,'checking c:'
call check(c)

allocate (d%met, SOURCE = 4)
print*,'checking d:'
call check(d)

contains
    subroutine check(x)
        type(t3), intent(in) :: x
        if (.not. allocated(x%met)) then
            print*,achar(9)//'there''s no type'
            return
        endif

        select type (met => x%met)
        type is (t1)
            print*,achar(9)//'type is t1'
        type is (t2)
            print*,achar(9)//'type is t2'
        class default
            print*,achar(9)//'unknown type'
        end select
    end subroutine
end

That seems to work with ifort|ifx|gfortran|flang-new.

1 Like

EQUIVALENCE was the traditional method of overlaying memory. But it doesn’t work with dynamic memory management (e.g., allocatables, pointers, etc.)

If the desired union is at the end of the base type, one could could use type extension (e.g., inheritance) and classes. @jwmwalrus just posted an example above.

You can also use C Interop features to do some aspects. Where things get tricky is if the union is in some interior subset of a larger structure.

1 Like

I’ve never used C unions or std::variant in C++, but I imagined that the overlaid fields in a struct (union) share the same memory on the stack (when the struct variable is declared as a local variable). In that sense, the use of allocatables is a bit different from the C union? (though I guess C does not specify any stack vs heap explicitly).

1 Like

So it would be using a void * for complete type erasure, and then some runtime type info to get back the actual type hold by the pointer, eh
I guess that would be one way of doing it, though not mine at all ^^

It would also lack the constraint set of option that union provides, as we could potentially pass everything in the pointer
This is outside of C interop talk, but to stay with this idea, we could also use runtime polymorphism and class inheritance could we not? I am not aware of the extent of the OOP features in fortran yet

You can place unions or std::variant on the heap. A small example can be found here: Japanese Subgroup GENERIC proposal - #20 by ivanpribec

struct circle { float radius = 1.0; };
struct square { float width = 1.0; };

using shape = std::variant<circle,square>;

int main() {
    std::vector<shape> my_shapes;
    // ...

The vector container uses heap-allocated memory.

1 Like

Unions and variants are both on the stack, no dynamic allocation occurs, it basically consists of a storage like an array of bytes of the size of the biggest member of the union/variant.

Edit: but if you want to dynamically allocate them, you can, it is up to what you want to do with them

1 Like

Thanks for both your info! (and yes, I was wondering about a local variable like shape s; rather than vector<shape> .... I also tried to emulate C unions with Fortran pointers in a derived type, but it also “wastes” 8 bytes etc for the pointer itself (and also the possible performance loss of indirection). I guess type punning will be possible in other ways (with c_f_pointer() or use of procedures with implicit interface), but I am afraid they may not be “standard” fortran anymore (once type punning is attempted…)

FWIW, swig-fortran can handle structure with union. It create all the c wrapper that is required. Usually you still need to write a bit of Fortran on top because swig wraps all c members into type(c_ptr) and you access the actual values through the generated type-bound procedures.

It’s tied to C interoperability because 1) C supports unions in structs and Fortran doesn’t, 2) This rules out difficult case such as POINTER or ALLOCATABLE members, 3) there are existing APIs full of unions that are difficult to access from Fortran.

One of the committee members pointed out that while you can fudge it with separate derived type declarations for the union members, you can’t get the automatic “largest size member” that this feature would provide.

We don’t want to add unions generally to the language, but it has been an often-requested feature and I felt that this was the right way to do it. Please read the linked paper for more details.

2 Likes

Slightly offtopic, but…

Recently, a GNU glibc upgrade broke a cli application I use often —I wrote it in Fortran, and it uses the regex.h API.

I noticed that the issue was that the regex_t struct now uses bit fields (a C23 feature) to save 48 bytes… and, of course, the glibc guys have never cared about backwards compatibility :slightly_frowning_face:.

So, is there a chance that the Committee revisits BITS (at least for interoperability with C) in the future or is that door closed for good?

2 Likes

We did revisit BITS for F2023 and chose instead to enhance support for bit operations without creating a new datatype. I discuss this in part in Doctor Fortran in “We’re All BOZos on This Bus” - Doctor Fortran

I don’t see us going back to BITS.

2 Likes

There are some problems with class(*), allocatable variables, see:

That’s a close description to what it is.


There was a thread recently, which provides a way of emulating a union: Data-pointer-object and unlimited polymorphic target. The solution involves sequence types and class(*), pointer variables. There is still some uncertainty about the conformity of such usage.

Plain union (potentially uncorforming)

program plain_union

use, intrinsic :: iso_fortran_env
implicit none

logical, parameter :: little_endian = ( 1 == transfer( [1_int8, 0_int8], 0_int16) )
integer, parameter :: dp = kind(1.0d0)

type :: t_double ! 8 bytes
   sequence
   real(dp) :: val
end type
type t_int ! 4 bytes
   sequence
   integer :: val
end type

! The data for the union
type(t_double) :: a

type(t_int), pointer :: a_i
type(t_double), pointer :: a_d

class(*), pointer :: u

print *, "little endian? ", little_endian

! z'3FD5 5555 5555 5555' ~= 1/3
a%val = 1.0_dp/3.0_dp
print *, a%val
print '(Z16)', transfer(a,1_int64)

u => a

a_i => u
print '(Z16)', a_i
a_d => u
print '(Z16)', transfer(a_d,1_int64)

end program

Output:

$ gfortran -Wall union.f90 
$ ./a.out
 little endian?  T
  0.33333333333333331     
3FD5555555555555
        55555555
3FD5555555555555
$ flang-new union.f90 
$ ./a.out
 little endian?  T
 .3333333333333333
3FD5555555555555
        55555555
3FD5555555555555
$ ifort union.f90 
$ ./a.out
 little endian?  T
  0.333333333333333     
3FD5555555555555
        55555555
3FD5555555555555
~> nagfor union.f90 
NAG Fortran Compiler Release 7.2(Shin-Urayasu) Build 7203
[NAG Fortran Compiler normal termination]
~> ./a.out
 little endian?  T
   0.3333333333333333
3FD5555555555555
        55555555
3FD5555555555555

All compilers silently accept the type-punning. The only one that doesn’t is nvfortran which says:

NVFORTRAN-S-0155-Illegal POINTER assignment - type mismatch (/app/example.f90: 36)
NVFORTRAN-S-0155-Illegal POINTER assignment - type mismatch (/app/example.f90: 40)
  0 inform,   0 warnings,   2 severes, 0 fatal for plain_union
Compiler returned: 2

Even if the plain union is not conformant due to the type-punning, the tagged case should be legal.

Tagged union

! tagged_union.f90

module menu

implicit none
private

public :: menu_item
public :: pizza_order, burger_order
public :: print_orders

integer, parameter :: ITEM_PIZZA = 1
integer, parameter :: ITEM_BURGER = 2

! N.b. sequence types are not polymorphic

type :: pizza
    sequence
    character(len=1) :: size ! S, M, L
    integer :: toppings ! Number of toppings
end type

type :: burger
    sequence
    integer :: patty_count
    logical :: cheese
end type

! Since item will contain
type :: menu_item
    private
    integer :: type
    class(*), allocatable :: item
end type

contains

    function pizza_order(size,toppings) result(order)
        character(len=1), intent(in) :: size
        integer, intent(in) :: toppings
        type(menu_item) :: order
        if (size /= 'S' .and. size /= 'M' .and. size /= 'L') error stop "Wrong pizza size"
        if (toppings > 5) error stop "Only up to 5 toppings are available"
        order = menu_item(ITEM_PIZZA,pizza(size=size,toppings=toppings))
    end function

    function burger_order(patty_count,with_cheese) result(order)
        integer, intent(in) :: patty_count
        logical, intent(in) :: with_cheese
        type(menu_item) :: order
        order = menu_item(ITEM_BURGER,burger(patty_count,with_cheese))
    end function

    subroutine print_orders(orders)
        type(menu_item), intent(in), target :: orders(:)
        integer :: i

        do i = 1, size(orders)
        select case(orders(i)%type)
        case(ITEM_PIZZA)
            block
                type(pizza), pointer :: p
                p => orders(i)%item
                print '("Pizza: Size ",A1,", ",I0," extra toppings")', &
                    p%size, p%toppings
            end block
        case(ITEM_BURGER)
            block
                type(burger), pointer :: p
                p => orders(i)%item
                write(*,'("Burger: ",I0," patties, ")',advance='no') &
                    p%patty_count
                if (p%cheese) then
                    write(*,'(A)') "with cheese"
                else
                    write(*,'(A)') "no cheese"
                end if
            end block
        case default
            error stop "Unknown menu item"
        end select
        end do
    end subroutine

end module

program tagged_union

use menu
implicit none

type(menu_item), allocatable :: ord(:)

ord = [ pizza_order(size='M',toppings=3), &
        burger_order(patty_count=2,with_cheese=.false.) ]

call print_orders(ord)

end program

Output:

$ flang-new tagged_union.f90 
$ ./a.out
Pizza: Size M, 3 extra toppings
Burger: 2 patties, no cheese
~> nagfor tagged_union.f90 
NAG Fortran Compiler Release 7.2(Shin-Urayasu) Build 7203
[NAG Fortran Compiler normal termination]
~> ./a.out
Pizza: Size M, 3 extra toppings
Burger: 2 patties, no cheese