C_f_pointer() borderline interpretation

In 18.2.3.3 C_F_POINTER (CPTR, FPTR [, SHAPE, LOWER]) of the 2023 standard, there are a lot of constraints on what CPTR and FPTR can be.

In particular, if writing something like call c_f_pointer( c_loc(X), Y ) these constraints enforce X and Y to have the same type/kind.

However, the third allowed usage case says:

What does mean precisely “a storage sequence that is not is use by any other Fortran entity”?

Is the following code legal? It associates a real pointer aa to what is at the beginning an integer pointer ii. But at the time the association is made, the function has returned, ii is out of scope, and the underlying storage is no longer accessible through ii… So, one could say that the storage is not in use by any other Fortran entity, right?

program test
use iso_c_binding
implicit none

real, pointer :: aa(:)

call c_f_pointer(get_pointer(1000),aa,[1000])

contains

    type(c_ptr) function get_pointer(n)
    integer, intent(in) :: n
    integer, pointer :: ii(:)

    allocate( ii(n) )
    get_pointer = c_loc(i)

    end function

end program 

The way I read that statement is that the storage sequence of case (iii) has been created outside of the Fortran language, i.e., from a C library or another external C-interoperable library. Makes sense?

It does, and it was also my understanding. But I wonder if it’s not a bit broader than that.

I’ve read what items (i) and (ii) say. (i) is about interoperable entities, (ii) is about non-interoperable ones. I thnk your case falls under either (i) or (ii) (if we consider that the default integer may not be interoperable). Here, it says:

If the value of CPTR is the result of a reference to C_LOC with a noninteroperable effective argument X, FPTR shall be a nonpolymorphic pointer […]. In this case, X shall not have been deallocated or have become undefined due to execution of a RETURN or END statement since the reference.[…]

It would seem to me that your case falls under (ii) (non-interoperable entity), but because it’s become undefined, it violates the standard.

Yes, but the 3 cases are not exclusive, it’s “if / if / if”, not “if / else-if / else-if”, so my code can fit both case(ii) and case(iii).

Yep, it’s a borderline interpretation, like you suggest

The C_LOC actual argument has a typo, it should be ii.Otherwise, I think your interpretation is correct and the code is conforming. The memory leak can be plugged by ending program test with a deallocate(aa) (which would be important if test was a subprogram unit).

1 Like

The Intel compiler fails deallocating aa though: Compiler Explorer

gfortran and llvm flang succeed.

From 9.7.3.3 Deallocation of pointer targets:

If a pointer appears in a DEALLOCATE statement, it shall be associated with the whole of an object that was created by allocation. The pointer shall have the same dynamic type and type parameters as the allocated object,

So my reading is that the original example is valid for the reasons stated, but that in order to deallocate the memory, it must be associated with a pointer of the same type as it was originally allocated with.

1 Like

Correct, I missed that.

Instead of deallocating aa(:) directly, I think one would need to do the alias trick backwards and deallocate an integer pointer that has been aliased to the aa(:) array. That is, the pointer metadata associated with a real array might differ in some significant detail from the pointer metadata associated with an integer array.

My impression of the above sentence is like “a storage sequence that is not created on the Fortran side” (i.e. the same as Federico in the 2nd post), and I guess the author might have avoided the word “create” or “allocate” etc because c_loc() could be used for module variables or other (static) variables (so intentionally making the sentence very “abstract”?)

1 Like

Maybe. But then why not writing it like that?

My feeling is that the authors of the standard are trying hard avoiding any possible storage association between objects of different types/kinds (beyond what was historically allowed in pre-F90), and that it gets tricky once the flexibilty of the C pointers are introduced in the langage.

Yes. That is what it means to be a type-safe language. Fortran is (to a large degree) a type-safe language.

Case (iii) has weird implications, though…

This is non-conforming:

integer, pointer :: ii(:)
real, pointer :: aa(:)
allocate( ii(n) )
call c_f_pointer(c_loc(ii),aa,[n])

This is conforming:

integer, pointer :: ii(:)
real, pointer :: aa(:)
type(c_ptr) :: p
allocate( ii(n) )
p = c_loc(ii)
ii => null()
call c_f_pointer(p,aa,[n])

Provided the last 8 words of Case (iii) hold, which gives compilers some latitude, if they want to be really strict about type-punning. They could object (runtime error) at the c_f_pointer call and not wait to see the Fortran pointer’s target referenced.

At least, that’s how I interpret it.

I imagine that some compilers may deny an association because of an improper alignment for instance. But it could also happen with a C pointer really coming from a C code.

The standard already has some discussion about why the following is not valid. The definition of a variable that is storage associated with variable(s) of a different type and/or kind causes those other variables to become undefined. This prevents the standard from having to say anything about internal representations.

integer :: a
real :: b
equivalence a, b ! this is valid, but a bad idea
b = 3.14
a = 42
print *, b ! invalid, the assignment to a causes b to become undefined
end

This is why equivalence has been declared obsolete. We have pointers now, and equivalence of objects of different type was a bad idea in the first place. The current restrictions on c_f_pointer/c_loc are to prevent you from putting yourself back in that situation. I.e.

use iso_c_binding
integer, pointer :: a
real, pointer :: b
allocate(a)
call c_f_pointer(c_loc(a), b) ! just say this is invalid to eliminate the footgun
b = 3.14
a = 42
print *, b
end

Except if one is a default real and the other one a default complex :wink:

It was originally a good idea, to save memory back in times where memory was a scarce resource and where no dynamic allocation was possible.

With modern machines that full of RAM + dynamic allocation, the need of storage association between different types is much less critical. But there is one domain where it can still make sense: HPC. In some domains there is never enough memory, and sometimes I’d like to have TBs of memory on one machine (and well, I had that 3 years ago: 3TB of Intel Optane SSDs transparently configured as RAM)…

I must admit, though, that for decades I haven’t really felt the necessity of inter-type storage assocation (*)… Except between the real and complex type, which I use on a regular basis: hence the proposal you are aware of.

(*) that said, by the meantime C++ has had the reinterpret_cast operator for that purpose.

I remember the use of equivalence in code from Cern.

Modifing the code for local circumstances was relatively straight forward after i realised what was happening in the equivalence statements.

Various subroutines and functions were developed by people from different parts of Europe.

Hence the equivalence of a variable called word (in a routine developed by some with an English background) and a variable called mot (in a routine developed by someone from France)

One of our technical writers spoke several European languages and we sat down together and drew up a short dictionary of variable names in equivalence statements.