Can we reinstate EQUIVALENCE?

The contributors to this thread have discussed bit manipulation and the organisation of COMMON blocks. Thank you. But for us, the most important use of EQUIVALENCE is in setting up a paged data structure. To explain:

  • fpt analyses large programs, often over a million lines long.
  • A large array (about 1.5 GB) is declared in a module.
  • The tables which describe the program are all mapped across the entire length of the array by EQUIVALENCE statements. For example, the cross-reference table has 8-byte records with indices 1 to 17640000, and the symbol table has 320-byte records with indices 1 to 4410000.
  • The array is divided into pages 201600 bytes long. Each page holds data from only one of the 27 tables used to analyse the code.
  • The token stream which represents the program is also mapped into the array.

There are two advantages in this approach:

  1. No table can run out of space until the entire array is full. We don’t have to set a total size for any table.
  2. The tables grow in such a way that new data in different tables are often close together, improving cache coherence.
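
A minimal sketch of the idea described above, not fpt's actual code: the page size, record lengths, table names and the `new_record` helper are all hypothetical and far smaller than the real ones, and every record is built from default integers (assumed to be 4 bytes), so the EQUIVALENCE here only associates arrays of the same type.

```fortran
program paged_equivalence_demo
  implicit none
  ! Hypothetical toy sizes: 4 pages of 60 default-integer words each.
  integer, parameter :: page_words = 60
  integer, parameter :: n_pages    = 4
  integer, parameter :: n_words    = page_words*n_pages
  integer, parameter :: xref_len   = 2    ! words per cross-reference record
  integer, parameter :: sym_len    = 10   ! words per symbol record

  ! One base array; every table is mapped across its whole length.
  integer :: store(n_words)
  integer :: xref(xref_len, n_words/xref_len)
  integer :: sym(sym_len, n_words/sym_len)
  equivalence (store, xref), (store, sym)

  integer :: pages_used = 0
  integer :: xref_next = 1, xref_last = 0   ! next free / last record in current page
  integer :: sym_next  = 1, sym_last  = 0
  integer :: i, k

  store = 0
  do i = 1, 40
     k = new_record(xref_next, xref_last, xref_len)
     xref(:, k) = [i, -i]
     if (mod(i, 10) == 0) then
        k = new_record(sym_next, sym_last, sym_len)
        sym(:, k) = i
     end if
  end do

  ! Page 1 holds xref records, page 2 sym records, page 3 xref records again.
  if (store(1) /= 1 .or. store(2) /= -1) error stop 'xref page 1 wrong'
  if (store(61) /= 10)  error stop 'sym page 2 wrong'
  if (store(121) /= 31) error stop 'xref page 3 wrong'
  print *, 'pages used:', pages_used

contains

  ! Return the index of a fresh record; claim the next free page when the
  ! table's current page is exhausted. page_words must be a multiple of rec_len.
  integer function new_record(next, last, rec_len) result(k)
    integer, intent(inout) :: next, last
    integer, intent(in)    :: rec_len
    if (next > last) then
       if (pages_used == n_pages) error stop 'store is full'
       pages_used = pages_used + 1
       next = (pages_used - 1)*(page_words/rec_len) + 1
       last = pages_used*(page_words/rec_len)
    end if
    k = next
    next = next + 1
  end function new_record

end program paged_equivalence_demo
```

Each table only fails when `pages_used` reaches `n_pages`, and a record is reached by a single subscript into its own table.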

It is, of course, possible to grow tables using move_alloc, or to allocate records individually and maintain pointers to them. However, there is a significant speed penalty and fpt users are impatient.

So how do I set up a paged data structure without EQUIVALENCE?

Here is one approach. I do not think this is strictly standard conforming, but then neither is equivalence for this purpose.

Ok, so suppose you have the underlying base array, and you want to alias an array of a different type to some of the elements in that array. You can use the combination of c_loc() to get addresses, followed by c_f_pointer() to alias those addresses to a fortran pointer of the desired type. You are using C interop features, but you aren’t writing any C code, everything is in fortran.
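
A minimal sketch of that combination, assuming `c_int` and `c_float` occupy the same number of bits (the program checks this). The array name, sizes and offset are hypothetical. As noted above, aliasing a target of one type through a pointer of another is probably not strictly conforming, and an optimising compiler may not expect the two names to overlap, so the result is printed rather than asserted.

```fortran
program c_alias_demo
  use, intrinsic :: iso_c_binding, only: c_int, c_float, c_loc, c_f_pointer
  implicit none
  integer(c_int), target  :: store(12)   ! hypothetical base array
  real(c_float),  pointer :: r(:)        ! alias of a different type

  if (storage_size(1.0_c_float) /= storage_size(1_c_int)) &
       error stop 'c_float and c_int differ in size'

  store = 0
  ! Run-time alias: r(1:4) overlays store(5:8).
  call c_f_pointer(c_loc(store(5)), r, [4])
  r = [1.0_c_float, 2.0_c_float, 3.0_c_float, 4.0_c_float]

  ! Show the bit patterns that now sit in the integer array.
  print '(4(z8.8,1x))', store(5:8)
end program c_alias_demo
```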

There are some technical problems with this approach. First, there are the hardware address problems mentioned previously, e.g. if you alias a real64 or an int64 entity to an odd address, then you are likely to have problems on some hardware. This is the same problem as with equivalence, but while equivalence is a static compile-time alias, this is now a run-time alias, which introduces some additional pitfalls to avoid. Second, this is standard conforming only for fortran variables that map to C variables through the C interop features. That means you use the c_int, c_short, c_long, c_float, c_double, c_long_double, and so on kind parameters from iso_c_binding in your codes, even though you aren’t actually interfacing to any C code. And if you want to equivalence a real16 array, or something else that is not covered by iso_c_binding, then you are flying under the pirate banner again.

@RonShepard - Thank you.
No problem with REAL(16) or REAL(10) - we can EQUIVALENCE them to COMPLEX(8). Oh dear!

However this seems an awkward runtime solution with possible performance penalties.

I think that we still need EQUIVALENCE in modern Fortran.

My previous reference to real16 was to 16-bit reals, which would equivalence to int16 16-bit integers. Those are not yet included in iso_c_binding, and they are not covered by any aliasing rules within fortran itself, so the programmer is on his own without standard support from either C interop or EQUIVALENCE. There are two de facto standard floating point formats, fp16 (5-bit exponent, 10-bit fraction) and bfloat16 (8-bit exponent, 7-bit fraction). fp16 is the IEEE standard, while bfloat16 is the Google AI standard. NAG fortran supports the IEEE format, Intel MKL supports the bfloat16 format. Other software and hardware vendors are also beginning to support one or both formats. The fortran kind system is flexible enough to support both formats at the same time, but for some reason fortran compilers seem to be slow to incorporate the short formats. Besides the AI and ML applications which seem to be driving this development, these short formats would also be useful in numerical methods, for example in multigrid methods or as iterative preconditioners. One would think that fortran would be the language leading this adoption and with the most development activity.
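
To make the two layouts concrete, here is a small sketch that decodes both from an int16 bit pattern using only intrinsics. It assumes `real32` is IEEE binary32 (so a bfloat16 is just its upper 16 bits) and handles only normal fp16 values, with no zeros, subnormals, infinities or NaNs; the variable names are illustrative.

```fortran
program short_float_demo
  use, intrinsic :: iso_fortran_env, only: int16, int32, real32
  implicit none
  integer(int16) :: bf_bits, fp_bits
  integer        :: s, e, f
  real(real32)   :: x_bf, x_fp

  ! bfloat16: 1 sign bit, 8 exponent bits, 7 fraction bits.
  ! Shift the pattern into the top half of a 32-bit word and reinterpret it.
  bf_bits = int(z'3FC0', int16)
  x_bf = transfer(shiftl(int(bf_bits, int32), 16), x_bf)

  ! fp16: 1 sign bit, 5 exponent bits (bias 15), 10 fraction bits.
  fp_bits = int(z'3E00', int16)
  s = int(ibits(fp_bits, 15, 1))
  e = int(ibits(fp_bits, 10, 5))
  f = int(ibits(fp_bits, 0, 10))
  x_fp = (1.0_real32 + real(f, real32)/1024.0_real32) * 2.0_real32**(e - 15)
  if (s == 1) x_fp = -x_fp

  ! Both patterns encode the same value, 1.5, in their respective formats.
  if (x_bf /= 1.5_real32 .or. x_fp /= 1.5_real32) error stop 'decode failed'
  print *, x_bf, x_fp
end program short_float_demo
```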

This reminds me of the situation with two VAX 64-bit floating point formats, one with an 8-bit exponent and the other with an 11-bit exponent, both supported by the hardware and by mathematical libraries. In the mid and late 1980s, in the time of fortran 8x, 88, and eventually 90, it seemed like an easy layup shot for VAX fortran to take advantage of that situation, but it never happened. In fact, there was never a f90 fortran compiler released by DEC for the VAX.

I totally agree. I think the proposal to remove EQUIVALENCE was flawed.
Is EQUIVALENCE any worse than creating dangling pointers?

I don’t.

Some time ago, I had to refactor a FORTRAN 66 code. Despite its small size (a few hundred lines) I got tired of cleaning gotos and arithmetic ifs. But even the authors of this old code back then knew that they’d better avoid the use of equivalence.

Yet, here we are in 2026 and Fortran users still can’t let go. The solution to the OP’s problem needs to be sought in improving the (memory management of the) tool he is working on. Not in attempting to reinstate an unsafe language feature that for good reasons has been declared obsolescent.

All modern languages since Java have moved away from pointers. To argue that one unsafe feature isn’t worse than some other unsafe feature isn’t helpful.

Thank you.

Do you have a suggestion for improving the memory management beyond the paged data structure which we currently use? Some of the tables occupy many tens, even hundreds of thousands of bytes, and MOVE_ALLOC leaves large air bubbles in the memory, which in this case would be counter-productive. We tested the use of pointers and the performance was disappointing. We don’t want to guess the size trade-offs between different tables, because these vary widely with the style of code under analysis. All suggestions are welcome!

Please could you explain the ‘good reasons’ for which EQUIVALENCE has been declared obsolescent?

I think @Machalot and @rwmsu already gave enough reasons in this thread why equivalence is unsafe. I agree with them.

Regarding your particular problem, I am sorry but I don’t have any specific advice for you. That may require a consultant who can take a look into your (I presume) proprietary code base.

In any case, your description sounds to me like that of an implementation rather than a language issue. Because (Fortran) compilers that are written in other languages do not seem to have the problems that you reported with the use of pointers.

Somehow, I missed this discussion earlier, but I am confident that EQUIVALENCE won’t be deleted from the language and that compilers will continue to support it regardless. I agree with those who say EQUIVALENCE is a bad programming practice, but if you have existing code using it feel free to keep it.

Just curious. Was this prior to the introduction of the CONTIGUOUS attribute for pointers? Based on what I think you are saying about your memory layout, you are accessing contiguous chunks of the underlying array. True?

You are quite right that we are accessing contiguous chunks of the underlying array. But a single table will be spread across multiple chunks with pieces of other tables in-between. To use the CONTIGUOUS attribute we would need pointers to the chunks (the pages) and pointers within them, causing a double-bounce memory reference.

The records of the different tables are of different sizes. A simple way to handle all this would be to allocate every record of every table when it is needed. I think the processor would spend most of its life in the allocation mechanism and the performance would be poor. The paged data structure allows each record to be accessed by a simple array index.
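
For comparison, this is roughly what the two-level alternative looks like. It is a sketch only: it uses allocatable components rather than pointers, and the type name, page size and record type are hypothetical. Every access needs a page lookup followed by a slot lookup, which is the double bounce mentioned above.

```fortran
program chunked_table_demo
  implicit none
  integer, parameter :: per_page = 4      ! hypothetical records per page

  type :: page_t
     integer, allocatable :: rec(:)       ! one page of single-word records
  end type page_t

  type(page_t), allocatable :: pages(:)
  integer :: n, p, s

  allocate(pages(3))
  do p = 1, size(pages)
     allocate(pages(p)%rec(per_page))
  end do

  ! Logical record n lives in slot s of page p: two dependent memory references.
  do n = 1, size(pages)*per_page
     p = (n - 1)/per_page + 1
     s = mod(n - 1, per_page) + 1
     pages(p)%rec(s) = 10*n
  end do

  ! Record 7 is slot 3 of page 2.
  if (pages(2)%rec(3) /= 70) error stop 'lookup failed'
  print *, 'record 7 =', pages(2)%rec(3)
end program chunked_table_demo
```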

A detail which I didn’t mention before is that the token stream which describes the code is also mapped into the array. When the array is completely filled, pages of the token stream are paged out to a direct access disk file. To do this we have to divide the array into pages. The system is modelled on the paging mechanism of VMS. fpt started life in 1989 on a VAX.
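
The paging-out step can be sketched with standard direct-access I/O. This is an illustration under assumptions, not the fpt mechanism: the page size is tiny, the page number is arbitrary, and a scratch file stands in for the real disk file. `inquire(iolength=...)` is used because the unit of `recl` is processor dependent.

```fortran
program page_out_demo
  implicit none
  integer, parameter :: page_words = 64   ! hypothetical page size in words
  integer :: page(page_words), back(page_words)
  integer :: u, rl, i

  page = [(i, i = 1, page_words)]

  ! Ask the processor for the record length of one page in its own units.
  inquire(iolength=rl) page
  open(newunit=u, status='scratch', access='direct', &
       form='unformatted', recl=rl)

  ! Page out page number 3, then page it back in.
  write(u, rec=3) page
  read(u, rec=3) back
  close(u)

  if (any(back /= page)) error stop 'page round trip failed'
  print *, 'page restored, record length =', rl
end program page_out_demo
```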

Agree… It is already marked as “obsolescent”, which means “do not write new code with it”. But there are so many legacy codes around that use it that it doesn’t sound reasonable to delete it from the standard at the moment. In contrast to arithmetic IFs or GOTOs, getting rid of EQUIVALENCE in an existing code can be quite a difficult task.

I have no fear that EQUIVALENCE will disappear any time this century. My concern is with the standard. There are things which can be done with EQUIVALENCE which, judging by the responses in this thread, cannot be done any other way.

@kkifonidis wrote that “Because (Fortran) compilers that are written in other languages do not seem to have the problems that you reported with the use of pointers.” It takes 15 minutes for ifx to build WRF. fpt can analyse it, re-engineer it and find thousands of anomalies in 2 minutes. That is because it is written in Fortran not C, and uses features ‘obsolescent’ in the Fortran language. Should they be?

The EQUIVALENCE and COMMON features were introduced for a world without derived types and modules.

Since you mentioned MOVE_ALLOC a few times elsethread, there’s already an underlying malloc in your implementation (with its own paging rules for contiguous/non-contiguous, and alignment-related stuff). So, you’re providing your own paging on top of another paging.

Maybe biting the bullet and splitting-into/migrating-to a dynamic array pattern is not such a bad idea. Something like:

type :: table
    ...
end type
type :: array
    integer :: sz = 0, cap = 750000
    type(table), allocatable :: data(:)
end type

(I apologize if my suggested implementation is too naive, but as @kkifonidis mentioned, not much can be done without looking at your (presumably) proprietary code base.)
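
A runnable version of that pattern might look like the sketch below. The record payload, the starting capacity of 4 and the `push` routine are assumptions made for illustration, and the capacity starts at zero rather than at a preset value so that `cap` always matches the allocated size.

```fortran
module dyn_table
  implicit none

  type :: rec_t
     integer :: fields(2) = 0             ! hypothetical record payload
  end type rec_t

  type :: dyn_array
     integer :: sz = 0, cap = 0           ! records in use / records allocated
     type(rec_t), allocatable :: data(:)
  end type dyn_array

contains

  ! Append one record, doubling the capacity when the array is full.
  subroutine push(a, r)
    type(dyn_array), intent(inout) :: a
    type(rec_t),     intent(in)    :: r
    type(rec_t), allocatable :: tmp(:)

    if (a%sz == a%cap) then
       a%cap = max(4, 2*a%cap)
       allocate(tmp(a%cap))
       if (a%sz > 0) tmp(1:a%sz) = a%data(1:a%sz)   ! the one copy per growth
       call move_alloc(tmp, a%data)
    end if
    a%sz = a%sz + 1
    a%data(a%sz) = r
  end subroutine push

end module dyn_table

program dyn_table_demo
  use dyn_table
  implicit none
  type(dyn_array) :: a
  integer :: i

  do i = 1, 1000
     call push(a, rec_t([i, -i]))
  end do

  if (a%sz /= 1000 .or. a%data(1000)%fields(1) /= 1000) error stop 'push failed'
  print *, 'size:', a%sz, ' capacity:', a%cap
end program dyn_table_demo
```

With doubling, 1000 appends trigger only a handful of reallocations, but each one still copies the whole table, which is the cost the original poster is worried about for very large tables.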

It is 96 bits in 32-bit executables. In COMMONs it may depend on ‘-f[no-]align-commons’ flag (default is ON).

He might also consider something like a C++ vector class implementation for table. I presume there are several available for Fortran if you look. I like @arjens implementation in his FLIBS software, at least as a place to start. That would cut down on a lot of allocations. You might want to track total memory use but that’s not hard.

What I posted was Go-inspired, but I think C++'s std::vector does a similar thing: Since the C++ standard requires it to be contiguous, there must be an underlying array (not a linked list). And to avoid too many malloc invocations, there should be some array capacity somewhere.

No, we experimented with MOVE_ALLOC and rejected it on grounds of efficiency. We only use one paging mechanism, which, by the way, was written in 1989 on a VAX. It works fine with sequence derived types (you need to be sure that there are no air bubbles in the records).

It’s not fully clear to me from your different posts, but it seems that you are EQUIVALENCE-ing between types that are not the mandatory types/kinds of the Fortran standard? If yes, your code is non-standard anyway, and it would be even if EQUIVALENCE was fully reinstated…

How did you use MOVE_ALLOC? Do you realloc/movealloc each time you need to modify the size? If yes, of course it is inefficient. A strategy à la C++ std::vector is needed here.

Yes, you are quite right that our code would not be fully standard even if EQUIVALENCE were re-instated. Nearly all the records are made up of 4-byte integers, but there are a few 8-byte integers and one table contains 16, 8 and 4-byte reals. The token-stream tokens contain four 1-byte integers and two 4-byte integers, 12 bytes in all. It is designed for speed and efficiency, not for strict standard conformance. It has been built with at least 6 different compilers: gfortran, ifort, ifx, CVF, FTN95 and HP-UX. Therefore we have avoided some modern Fortran constructs which might not be supported, and also MAP and UNION.
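
One way to check a record for air bubbles is to compare the storage size of the sequence type with the sum of its components. The sketch below does this for a token shaped like the one described (four 1-byte integers and two 4-byte integers); the type and component names are hypothetical, and whether padding appears is up to the compiler, so the program reports the result instead of assuming it.

```fortran
program token_size_demo
  use, intrinsic :: iso_fortran_env, only: int8, int32
  implicit none

  type :: token_t
     sequence
     integer(int8)  :: a, b, c, d         ! hypothetical component names
     integer(int32) :: p, q
  end type token_t

  type(token_t) :: t
  integer :: expected_bits

  expected_bits = 4*storage_size(t%a) + 2*storage_size(t%p)

  print *, 'token size in bytes:', storage_size(t)/8
  if (storage_size(t) == expected_bits) then
     print *, 'no padding in the record'
  else
     print *, 'compiler inserted padding'
  end if
end program token_size_demo
```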

My issue with EQUIVALENCE is more with the Fortran Standard than with our own code. We re-engineer programs and sometimes generate new EQUIVALENCE statements. This occurs, for example, when COMMON blocks are replaced by modules and the COMMON block is mapped differently in different routines. This situation is an implied equivalence which we make explicit. But I do not like generating non-standard or obsolescent code.
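
As an illustration of that situation (the thread does not show real fpt output, so the block name, variable names and layout here are hypothetical): suppose one routine declares `COMMON /blk/ x(4)` and another declares `COMMON /blk/ p, q, r(2)`. Moving both views into a module makes the implied overlay explicit.

```fortran
module blk_mod
  implicit none
  ! View used by the first routine:  COMMON /blk/ x(4)
  real :: x(4)
  ! View used by the second routine: COMMON /blk/ p, q, r(2)
  real :: p, q, r(2)
  ! The storage association that the two COMMON declarations implied.
  equivalence (x(1), p), (x(2), q), (x(3), r)
end module blk_mod

program common_to_module_demo
  use blk_mod
  implicit none

  x = [1.0, 2.0, 3.0, 4.0]

  ! The second view sees the same storage under different names.
  if (p /= 1.0 .or. q /= 2.0 .or. r(1) /= 3.0 .or. r(2) /= 4.0) &
       error stop 'views disagree'
  print *, p, q, r
end program common_to_module_demo
```

Because all the objects are default real, this overlay is standard conforming, although still obsolescent.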

We tested MOVE_ALLOC in a relatively simple way. Whenever we ran out of space in a table we reallocated it, usually doubling the size. This is very expensive, but I think that anything which moves large amounts of data will be. We could perhaps have spent more time on this, but our existing paging mechanism was already in place and we were probably less motivated than we should have been.