Bind(c) on derived type definition and/or object declaration

What’s the difference between the bind(c) attribute on the definition of a derived type versus on the declaration of individual objects of that type?

type, bind(c) :: best_type                 ! 1
   integer(C_INT) :: highest_number        ! 2
end type best_type                         ! 3
                                           ! 4
type(best_type), bind(c) :: unobtainium    ! 5

In my test cases, omitting bind(c) from the object declaration on line 5 makes that variable inaccessible to the C portion of my code. As expected.

However, I haven’t seen any obvious effect from omitting bind(c) from the type definition on line 1 while keeping it on line 5. What should happen? Is it just luck that the memory map lined up between the trivial Fortran and C structures?

https://j3-fortran.org/doc/year/24/24-007.pdf, 18.3.4 Interoperability of derived types and C structure types.

If you include BIND(C) on the type definition, the numbered constraints C1801-C1806 mean that the compiler should reject code that violates them, which helps you write standard-conforming code that has a hope of working as intended.

The Fortran standard does not say how a compiler should represent a derived type in memory; in particular, the compiler is allowed to reorder the components. The C rules for structures differ: a C compiler must keep the members in declaration order, although it may insert padding between them. So bind(C) in the type definition tells the Fortran compiler to follow the C rules here.
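
To make the C side of this concrete, here is a minimal sketch of the layout rules a bind(C) type has to match. The struct `sample` and both helper functions are hypothetical names made up for illustration; only `best_type` and `highest_number` come from the original question.

```c
#include <stddef.h>

/* C counterpart of the single-component type from the question. */
struct best_type {
    int highest_number;
};

/* Hypothetical struct with mixed member sizes, for illustration only. */
struct sample {
    char   tag;
    double value;
    int    count;
};

/* Returns 1 if the members sit in memory in declaration order.
   C guarantees this, and also that the first member is at offset 0. */
int members_in_declaration_order(void)
{
    return offsetof(struct sample, tag) == 0
        && offsetof(struct sample, tag) < offsetof(struct sample, value)
        && offsetof(struct sample, value) < offsetof(struct sample, count);
}

/* Bytes of padding the compiler inserted; the amount is platform-dependent. */
size_t padding_bytes(void)
{
    return sizeof(struct sample)
         - (sizeof(char) + sizeof(double) + sizeof(int));
}
```

With a single integer component, as in `best_type`, there is nothing to reorder and normally nothing to pad, which may be why the trivial test case lined up even without bind(c) on the type. That is an observation about this sketch, not a guarantee from either standard.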

Fortran 90/95 supported (and Fortran still supports) using a SEQUENCE statement in a derived-type definition to tell the compiler that the order of the components in the type specifies a storage sequence for the type components. SEQUENCE allowed derived types to appear in COMMON or EQUIVALENCE statements. Some compilers would also allow you to pass a derived type with a specified SEQUENCE to an equivalent C structure as an actual argument, so it had an effect similar to BIND(C). For those unfamiliar with SEQUENCE, it would look like this:

type mytype
  sequence
  real :: a(100)
  integer :: b(50)
end type mytype

Most F90/F95 programming books of the time said that unsequenced (regular) derived types were to be preferred.

Nonetheless SEQUENCE and BIND(C) are different, even if on some compilers they have the same effect (but not on the Intel compiler, for instance).

Is this an accurate summary? An object with bind(c) in its declaration would be visible to the linker for C interoperability, but if it lacks bind(c) on its derived type definition its internal memory layout is not guaranteed to be compatible with C.

I think the purpose of SEQUENCE was to allow two different derived types, with separate definitions, to be interoperable. The keyword basically required some fixed convention by the compiler for memory layout and padding within the derived types, but without actually defining the details of that convention. Without that keyword, the memory layout of the components might depend on things like compiler options or optimization levels, so the two separate definitions might not necessarily match exactly.

Someone could write a library routine with one derived type, and then it could be called by someone else as an external procedure with an interface block that uses a different derived type with no other information shared between them (e.g. no USE statements). I always thought of this as a carve-out in the language for commercial vendors writing proprietary library code.

I’m sure this is part of it, but my copy of Adams et al., “Fortran 95 Handbook”, chapter 4, page 82, paragraph 3 says:

"However, if a SEQUENCE statement appears inside the type definition, the type is considered to be of sequence type. In this case, the order of the components specifies a storage sequence for objects of the type such that such objects may appear in COMMON or EQUIVALENCE statements".

Obviously, what you state is also true but I think the original intent was to provide a way for derived types to be referenced in storage association constructs.

By the way, I have never used bind(C) on module variables, and it’s unclear to me what it does exactly.

In C, if I’m not wrong, to share a global variable between different source files, one must declare int myvar in one file, and declare extern int myvar in all the other files.
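
For reference, the C rule described above can be sketched in a single file. This is only an illustration: in a real program the `extern` declaration and the definition would live in different source files, and the initial value 42 and the two helper functions are made up for the example.

```c
/* Declaration only: states that myvar exists somewhere and reserves
   no storage. In a multi-file program this line would appear in every
   other source file, or in a shared header. */
extern int myvar;

/* Reads the variable through the declaration above. */
int read_myvar(void)
{
    return myvar;
}

/* Definition: exactly one translation unit provides the storage.
   The value 42 is an arbitrary choice for this sketch. */
int myvar = 42;

/* Hypothetical helper: the declaration and the definition name the
   same object, so a change made here is seen by read_myvar(). */
int bump_myvar(void)
{
    myvar += 1;
    return read_myvar();
}
```

Calling `read_myvar()` and then `bump_myvar()` goes through the same object, which is the sharing behaviour that the `extern` declaration is meant to provide across files.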

When a Fortran module variable is declared with integer(c_int), bind(C) :: myvar, it is effectively shared with a C source file, whether the C side declares it as int myvar or as extern int myvar.

This is confusing to me… Does it mean that bind(C) on a module variable is neither equivalent to int myvar nor extern int myvar? But then, to what is it equivalent?