-fallow-argument-mismatch

It’s not me trying to offer a roundabout explanation; it’s the Fortran standard itself. Section 18.3.1 on interoperable types pairs Fortran types and type parameters with C types so that they interoperate in certain circumstances. One such case is nonnegative values of types that are signed integers on the Fortran side and unsigned in C. Similarly, unsigned char in C and a Fortran integer of kind C_SIGNED_CHAR are interoperable as a side effect of that allowance.

Were it not for that allowance, the description of C_F_POINTER would not need to distinguish whether the source pointer is of interoperable type or not; the restriction could have called for the “same exact type and type parameters” in both cases.

P.S. This site doesn’t load on the two Windows 10 laptops I have at the moment; is anyone else facing the same problem? I’ve been encountering loading issues with this site off and on since the recent denial-of-service attack on the Discourse site (it got worse around last week). I’m typing this on my phone, and it’s presently inconvenient to put forth any code examples.

Not exactly. It’s a bit more nuanced. I’m saying my example is NOT standards conforming (TKR mismatch of arguments), but the storage sequence is defined by the standard such that you can expect it to work anyway. Hence your observation that lots of programmers took advantage of that fact.

My reading of the standard was that it only referred to signed vs unsigned types explicitly. You’d have to read further into the C standard to make inferences about other types (I think). I’m not sure I fully understand the nuances and what is said in the C standard to make the leap to interoperability between types of different sizes.

I would say that using EQUIVALENCE requires an understanding of memory addresses.
I think that is the key. To use EQUIVALENCE, you must understand that memory is sequential. What is wrong with this assumption? Array sections are a different complexity that should not exclude the use of EQUIVALENCE in other typical cases. It is not that complex!

With regard to INTEGER types, there are only 4 types : 1-byte, 2-byte, 4-byte and 8-byte on all of the processors I have used in the last 30 years… I am not familiar with why C has more types.

With my limited understanding of C interoperability, I have tried to use c_ptr integers as 8-byte integers on x64, but obtained compiler error messages. Again I solved this with EQUIVALENCE, by recasting the memory address (might have used TRANSFER but they are basically 8 bytes). As someone who is familiar with F77 wrappers, it works for me. (Why should this capability be excluded ?)

use iso_fortran_env, only: integer_kinds
write(*,*) integer_kinds
end program

$ gfortran xxx.f90 && a.out
           1           2           4           8          16

C compilers have typically the same integer types as fortran compilers, plus an unsigned version of each. Fortran is now required to have two integer kinds, a default kind and an extended kind. I think only the C int type is required to be interoperable with a fortran kind. Everything else is beyond those minimal requirements.

Here’s an example you can try. It needs a Fortran 2018 processor, though, because the enhanced interoperability facility is required to manipulate the Fortran pointer from C.

#include <stdio.h>
#include <stdint.h>
#include "ISO_Fortran_binding.h"

int64_t i64;

void set_int64(int64_t x) {
   i64 = x;
   return;
}

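/* Point the Fortran pointer descriptor x at the storage of i64,
   described as a rank-1 array of 8 int8_t elements. */
void get_int8( CFI_cdesc_t* x ) {
   CFI_index_t ext[1];
   ext[0] = 8;   /* extent: sizeof(int64_t) / sizeof(int8_t) */
   /* elem_len is ignored for CFI_type_int8_t; it only matters for
      character, struct and "other" types */
   int irc = CFI_establish(x, &i64, CFI_attribute_pointer, CFI_type_int8_t,
                           (size_t)8, (CFI_rank_t)1, ext);
   if (irc != CFI_SUCCESS) fprintf(stderr, "CFI_establish failed: %d\n", irc);
   return;
}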
  • Fortran main
   use iso_c_binding, only: c_int8_t, c_int64_t, c_loc, c_f_pointer
   logical, parameter :: IS_BIG_ENDIAN = iachar( c=transfer(source=1,mold="a") ) == 0

   interface
      subroutine set_int64( x ) bind(C, name="set_int64")
         import :: c_int64_t
         integer(c_int64_t), intent(in), value :: x
      end subroutine
      subroutine get_int8( x ) bind(C, name="get_int8")
         import :: c_int8_t
         integer(c_int8_t), pointer, intent(inout) :: x(:)
      end subroutine
   end interface

   integer(c_int64_t) :: i64
   integer(c_int8_t), pointer :: i8(:)
   integer :: i

   i64 = 1234567890_c_int64_t
   print "(g0,*(b64.64))", "i64: ", i64
   call set_int64( i64 )
   call get_int8( i8 )
   if ( .not. IS_BIG_ENDIAN ) then
      print "(g0,*(b8.8))", "i8:  ", (i8(i), i=ubound(i8,dim=1),lbound(i8,dim=1),-1)
   else
      print "(g0,*(b8.8))", "i8:  ", (i8(i), i=lbound(i8,dim=1),ubound(i8,dim=1),1)
   end if

end
  • Program behavior using Intel Fortran and Microsoft C/C++ as companion processor:
C:\temp>cl /c /W3 /EHsc c.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.33.31630 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

c.c

C:\temp>ifort /c /standard-semantics p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.7.0 Build 20220726_000000
Copyright (C) 1985-2022 Intel Corporation.  All rights reserved.


C:\temp>link p.obj c.obj /subsystem:console /out:p.exe
Microsoft (R) Incremental Linker Version 14.33.31630.0
Copyright (C) Microsoft Corporation.  All rights reserved.


C:\temp>p.exe
i64: 0000000000000000000000000000000001001001100101100000001011010010
i8:  0000000000000000000000000000000001001001100101100000001011010010

C:\temp>

Thus the point is that the earlier approach using C_F_POINTER is a shortcut to doing the same through C, and with suitably defined types it is also defined behavior.

So storage addresses are not memory addresses! What semantics is this? Where do you suggest information is stored? I would suggest that the same storage units are actually referenced by their memory location, i.e. their address.

I actually learnt Fortran in 1973, and for the next 20 years hardware suppliers’ documents were the main reference. There were many extensions to Fortran that were necessary to implement effective computation, and they were described in these documents. These extensions were provided to improve Fortran’s functionality, something which is now less important. (Actually the two biggest changes to Fortran usage for me have been graphics and OpenMP, neither of which is in the standard.)

As I have said, I have not looked up equivalence in the standard until this post (I have still not referenced a standard, but relied on Lahey’s Fortran 95 Language Reference). I have never seen any compiler reference to this change until the recent NAG example.

Based on my understanding of what functionality EQUIVALENCE provides, there is no justification for these subsequent restrictions that have been applied. They are an unnecessary restriction that does not achieve any benefit. I would love to see the argument for their inclusion in the standard.

I do recall past complaints about EQUIVALENCE in Fortran, one by a C programmer who had not bothered to understand what equivalence was. Is this where we are now, with Fortran being changed because C exponents don’t like it? It’s a shame that users of Fortran are not included.

Anyway, I would love to see the justifications for the changes to equivalence in the standard and what user feedback was requested.

What changes to equivalence are you referring to? I think its status has changed, but its specification has not changed.

@RonShepard, we are referring to possible changes to EQUIVALENCE that go back to F90 or F95.
This was introduced in post #10 by @FortranFan, then questioned about compliance in #11 by @everythingfunctional and also @kargl in a post I can no longer find (hidden?).

Apparently you can’t mix different types or kinds in EQUIVALENCE, a restriction I was never aware of and about which I have never received a warning from any compiler I have used.

One of the main uses I have made of equivalence has been to access bytes in integer arrays of various kinds, and also access bytes or memory addresses as unsigned integers. This is potentially excluded by F9?, but interestingly acknowledged as a possibility in the Lahey F95 Language documentation.

This is a common functionality that I have used for many years. It also appears in WinAPI documentation and code examples, so I am further amazed that the Standards Committee would have gone down this path.

I really do not know what is the problem in using EQUIVALENCE. EQUIVALENCE is a basic concept, so I have never referred to the standard for clarification.
I recall other posts from C users, where apparently Fortran EQUIVALENCE is unexpected in its behaviour.
Hardly sufficient reason to remove it from Fortran !

My apologies, as I had forgotten about 16-byte integers.

When were they introduced and are they hardware supported (any hardware instructions) ?
It is certainly scary to contemplate an X128 OS !
Won’t get that with DDR5 memory.

No, there might be 5. gfortran has them, ifort does not. The standard only requires an integer kind with a range of at least 18 decimal digits, which in practice means 64-bit integers.

program kinds2
  use iso_fortran_env
  print *, integer_kinds
  print *, real_kinds
end program kinds2
$ gfortran-12 t.f90 && ./a.out
           1           2           4           8          16
           4           8          10          16
$ ifort t.f90 && ./a.out
           1           2           4           8
           4           8          16

As for C interoperability, the standard iso_c_binding module has no support for 128-bit integers. And there could hardly be any, since standard C does not define a 128-bit integer type either (where compilers such as gcc offer one, it is an extension).

The term is storage unit, not address. And they might be in memory on a computer, but the standard never uses those terms. It doesn’t even use the concept of file systems or interactivity beyond individual files and possibly pre-connected input, output and error units. Nor does it use the term compiler. For all it matters to the standard, a standards conforming processor could be a student at a blackboard.

Here you fully acknowledge knowingly going outside the standard. Your use of EQUIVALENCE happens to also fall outside the standard, whether you realised it at the time or not. It has never been my position that one should never do that, just that you should realise it and understand the risks associated with it.

The advances of computer science over the last half century have been largely about discovering error-prone patterns and designing languages in ways that limit the chances of making those errors. The quintessential example is the removal of goto from nearly every language still in use. Static type systems are another invention that turned out to be highly beneficial. It was determined that EQUIVALENCE was a largely error-prone backdoor around the type system, hence the restrictions and its eventually being declared obsolescent to discourage its future use. If you’re fully aware of the risks and confident in your ability to use it safely, go right ahead, but the rest of us would like to let future newcomers to the language know that “this thing is potentially dangerous.”

+1, very well-written summary, thank you for this!

Thank you for the example. I had suspected that CFI_cdesc_t would be required. I’ll note that this still seems to go around the type system in a way that C cannot check, as CFI_establish takes a void * as the second argument. Is there any way to get a copy of the C standard to check what it says about these kinds of things?

PS, the IS_BIG_ENDIAN trick is neat. I suspect very helpful in lots of situations.

@JohnCampbell ,

To try to add to the previous comments and to get back to your original post where you express a certain use case which then compels you toward EQUIVALENCE, as difficult as it may be to accept, you may again want to take note:

  1. EQUIVALENCE is seen as error-prone and is labeled obsolescent; there is likely no turning back on this and, as explained to you by @everythingfunctional, that is a good thing for quite a few other practitioners as well as for compiler implementations looking into the future.
  2. However, you will find that several existing compiler implementations, such as gfortran and Intel Fortran, continue to support nonstandard extensions to EQUIVALENCE and are likely to keep doing so, as pointed out to you by @everythingfunctional, so again you have good options since you are comfortable with using nonstandard extensions. I tried an example shown below, and both gfortran and Intel Fortran appear to produce programs that work as you would expect.
  3. More importantly though, if you share more details, perhaps with some actual code and far more than your original post gave, chances are high that many of the readers here, especially @everythingfunctional et al. with experience in modernizing legacy codebases, will be able to point out how to approach and solve your problem differently, which might then render moot the whole issue about EQUIVALENCE, type punning, pointer casting, etc. I suggest you give that a try.
  • Example with nonstandard EQUIVALENCE usage
module data_m
   use, intrinsic :: iso_fortran_env, only : B1 => INT8, B4 => INT32
   integer, parameter :: MAXDAT = 3 !<-- Arbitrary value
   integer(B4) :: buffer(MAXDAT)  
   integer(B1) :: sub_records(4*MAXDAT)
   equivalence( buffer, sub_records )
contains
   subroutine load_data()
      ! Simulate here the reading of the data from source, file or a database, etc. as
      ! a naive assignment
      buffer = [ int( b"10101010101010101010101010101010", kind=kind(buffer) ), &
                 int( b"10101010101010101010101010101010", kind=kind(buffer) ), &
                 int( b"10101010101010101010101010101010", kind=kind(buffer) ) ]
      print *, "In load_data: buffer = "
      print "(*(b32:,'; '))", buffer
   end subroutine
   subroutine consume_data()
      print *, "In consume_data: sub_records = "
      print "(*(b8:,'; '))", sub_records
   end subroutine 
end module
   use data_m
   call load_data()
   call consume_data()
end
  • Program response using gfortran
C:\temp>gfortran p.f90 -o p.exe

C:\temp>p.exe
 In load_data: buffer =
10101010101010101010101010101010; 10101010101010101010101010101010; 10101010101010101010101010101010
 In consume_data: sub_records =
10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010

C:\temp>
  • Program response using Intel Fortran
C:\temp>ifort /standard-semantics p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.7.0 Build 20220726_000000
Copyright (C) 1985-2022 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.33.31630.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\temp>p.exe
 In load_data: buffer =
10101010101010101010101010101010; 10101010101010101010101010101010; 10101010101010101010101010101010
 In consume_data: sub_records =
10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010; 10101010

C:\temp>

I agree it entirely goes around the type system, and as such I would only bring this up in certain specific situations, such as in C when downcasting integer pointers of different widths (KINDs in Fortran terminology), and also with C char types given their legacy of byte equivalence (bad pun intended!) with an integer type of a certain width (usually 8 bits). It is only for such specific situations that my takeaway has been that the C language provides defined behavior in terms of bit representation, even though I didn’t find it stated directly as such in their language semantics.

For the C standard, I usually suggest this link for an open draft PDF copy…

The OP, starting with the original post, indicated working with integers of different kinds: “To merge the sub records, I am converting the buffer “rec_data”, which is provided as 4-byte (default) integers to 1-byte integer vector so the sub-records can be easily merged.” It was only because of that that I felt this option would be a reasonable choice.

This was discussed earlier in this thread, although it is now getting difficult to locate specific posts because of the long thread length. Prior to f90, there was only one integer type, one logical type, one complex type, and two real types. F77 also allowed only one real type to be supported as a subset, but I’ll ignore that complication here. F77 also had characters, but it did not define how character storage sequences mixed with the numeric storage sequences. The f77 standard allowed all of those types to be equivalenced and to be storage associated in common blocks. It defined that association through numeric storage units, not through addresses. Namely, everything was one storage unit except for DOUBLE PRECISION and COMPLEX which occupied two storage units each.

Type punning through dummy argument lists is related to this, but its behavior was never defined by the standard. Fortran programmers of that era did this all the time in order to reuse workspace when memory was limited and for low-level bit fiddling. I certainly did it, and I have legacy code that still does it. But it was not defined by the standard, and there was always the possibility that a standard conforming compiler would either reject it, or would accept it and not do what I expected. As a practical matter, that never happened. Compilers always did what was expected in these cases, and if they did detect the standards violation and print a warning, there were always ways to ignore or override those warnings.

Pre-f90 compilers often supported other types too. The common notation was LOGICAL*1, INTEGER*2, INTEGER*8, REAL*16, COMPLEX*16, and so on. These were never part of the fortran standard, and the storage sequence association of these types was never defined by the standard. It was possible to write portable code that used these declarations, but the programmer had to be careful. On the other hand, the computer vendors loved it when programmers used these extensions because it locked in their customers to keep buying their hardware and software. Thus the dilemma that was faced by us programmers of that era.

When f90 introduced the KIND system, it replaced all of that nonstandard notation with a general and open-ended approach. I’m a big fan of the fortran KIND system. So now it was possible to declare and use all of these different types and kinds, but the programmer must still be careful because of portability. There was still only one integer kind and two real kinds required by the f90 standard; everything else is extra. More recently, the standard also requires support for an extended integer kind (at least 18 decimal digits). But the fortran standard has never defined the behavior of either EQUIVALENCE or storage sequence association for all of these different TYPES and KINDS, beyond what was specified for the default intrinsic types in the previous standards. So it looks like we have that only because of backwards compatibility with the pre-f90 standards.

So I don’t think it is correct to say that the fortran standard has imposed new restrictions on EQUIVALENCE. It is rather that they have continued on with what was in the previous standards without extending it to the newer type-kind system. Now EQUIVALENCE has been designated as obsolete. Common blocks are also on the black list, either obsolete or deprecated, I forget which. That is how the language has been evolving over the last 60 years.

If FortranFan is correct in his interpretations of the fortran and C standards, the C interoperability facilities actually provide some standard-conforming ways to effectively equivalence types and kinds that were not covered by the backwards compatibility of EQUIVALENCE and the storage sequence requirements. TRANSFER() also does some of this in ways that are both more restrictive and more general than the old approaches. In a separate thread, I pointed out that if the intrinsic subroutine MVBITS() were generalized only a little, then it would allow entirely portable code to be written that bypassed even the little- and big-endian addressing problems that are inherent in the type-punning, equivalence, and storage association approaches. So there are ways for fortran to move forward in positive and productive ways from its present position.

EQUIVALENCE has not been designated as obsolete. In f2018 it became obsolescent, meaning that it was not recommended, for reasons given in the standard, but it can still be used in a standard-conforming way. Nothing is described as obsolete in f2018, though some things valid in earlier standards have been deleted.
