Can we reinstate EQUIVALENCE?

We have a code base of tens of millions of lines.
Just over 70% of the packages contain EQUIVALENCE statements.

The uses we identified are:

  • Organising data for communication between processes. One package used in financial control and analysis contained over 20,000 EQUIVALENCE statements mapping named variables into arrays (I was surprised so I checked!)
  • Organising COMMON blocks. Several aircraft simulations have between 2000 and 4000 EQUIVALENCE statements for this purpose.
  • Providing short local names for components of derived types and structures. For example EQUIVALENCE (pitch,attitude%pitch)
  • Building data structures for attached hardware, essentially re-arranging bytes. This could be done by other means, but as pointed out by @RonShepard and others this can be awkward.
  • Implementing paged data structures. This is important in our work. By way of explanation: fpt analyses code and stores data in multiple tables (currently 27). All these tables are overlaid onto the same large array, which is divided into pages. When, for example, a new symbol table record is needed, a counter is incremented. When the counter reaches the end of the current page, a new page is used (not allocated) to continue the table. This has several advantages:
    i. We don’t specify a maximum size for any table, so no table can run out of space until the shared array itself is exhausted.
    ii. As the data for a large program grows, the table elements which describe it remain fairly close in memory. This probably helps cache handling.
    iii. Memory is not allocated by ALLOCATE constructs, so we avoid a double reference to find, for example, a symbol table, cross-reference table or sub-program table entry.
    The 27 tables, and the token stream of the code, are overlaid by EQUIVALENCE statements.
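For readers who have not met the paged-table idea, here is a minimal sketch of how it might look. It is not fpt's actual code: the page size, the two table names (`symtab`, `xref`), the `page_map` lookup and the helper functions `slot` and `new_record` are all assumptions made for illustration. In particular, the page map is an extra indirection that the real implementation may well avoid.

```fortran
program paged_tables
  implicit none
  integer, parameter :: page_size = 4, n_pages = 6, n_tables = 2
  integer, parameter :: sym = 1, xrf = 2        ! hypothetical table identifiers
  ! One shared array; both tables are overlaid on it by EQUIVALENCE.
  integer :: pool(page_size*n_pages)
  integer :: symtab(page_size*n_pages), xref(page_size*n_pages)
  equivalence (pool, symtab), (pool, xref)
  integer :: pages_used = 0                     ! pages handed out so far
  integer :: n_recs(n_tables) = 0               ! records stored per table
  integer :: page_map(n_pages, n_tables) = 0    ! logical page -> physical page
  integer :: i, k

  ! Grow both tables in an interleaved fashion; neither has a fixed maximum.
  do i = 1, 6
     k = new_record(sym)
     symtab(k) = 100 + i
     k = new_record(xrf)
     xref(k) = 200 + i
  end do

  ! Each table must read back its own records despite sharing storage.
  do i = 1, 6
     if (symtab(slot(sym, i)) /= 100 + i) error stop 'symbol table corrupted'
     if (xref(slot(xrf, i)) /= 200 + i) error stop 'cross-reference table corrupted'
  end do
  print '(6(i0,1x))', (symtab(slot(sym, i)), i = 1, 6)
  print '(6(i0,1x))', (xref(slot(xrf, i)), i = 1, 6)

contains

  ! Position in the shared array of record number rec of the given table.
  integer function slot(table, rec)
    integer, intent(in) :: table, rec
    integer :: logical_page
    logical_page = (rec - 1)/page_size + 1
    slot = (page_map(logical_page, table) - 1)*page_size + mod(rec - 1, page_size) + 1
  end function slot

  ! Increment the table's counter; take the next free page when the
  ! current one is full. No ALLOCATE is involved.
  integer function new_record(table)
    integer, intent(in) :: table
    integer :: logical_page
    n_recs(table) = n_recs(table) + 1
    logical_page = (n_recs(table) - 1)/page_size + 1
    if (page_map(logical_page, table) == 0) then
       if (pages_used == n_pages) error stop 'shared array exhausted'
       pages_used = pages_used + 1
       page_map(logical_page, table) = pages_used
    end if
    new_record = slot(table, n_recs(table))
  end function new_record

end program paged_tables
```

With these sizes the physical pages end up interleaved (symbol table, cross-reference, symbol table, cross-reference), which is the "remain fairly close in memory" property described above.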

There must be a very good reason to remove EQUIVALENCE from the standard. However, I doubt that it will disappear from compilers or from widespread use. Can we reinstate it?

BTW my little search code is:

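#!/bin/bash

# For each package directory listed in codes.txt (one per line), append
# "<dir>   <count>" to equiv.txt, where <count> is the number of lines
# mentioning "equivalence" (case-insensitive, comments included).
while read -r F
do
  printf '%s   %s\n' "$F" "$(grep -ir equivalence "$F" | wc -l)" >> equiv.txt
done < codes.txt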

I certainly still use it. Much easier for emulating byte operations.
Which compilers do not compile EQUIVALENCE correctly?

Most of my codes still contain EQUIVALENCE and all build the last time I tried.

EQUIVALENCE is tagged as obsolescent (since Fortran 2018), which means it is discouraged in new code. But I am pretty sure it won’t be deleted soon.

I never use EQUIVALENCE because I find it extremely confusing and an artifact of a bygone age when memory was small and hard to come by. It makes debugging code a nightmare (at least for me). Also, isn’t it a form of “aliasing”, like pointers, that can prohibit vectorization? I also don’t find Fortran’s intrinsic bit-manipulation routines awkward to use. As with Fortran’s support for CHARACTER data/strings, it just takes a little more thought about how you use them than the equivalent functionality in other languages. That doesn’t mean you can’t do things that other languages can do. You just might have to write your own, usually small, functions to provide the missing features.
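To make the comparison concrete, here is a small sketch that looks at the bit pattern of a real in two ways: through an EQUIVALENCE overlay and through the TRANSFER and IBITS intrinsics. It assumes `real32` and `int32` are the default 4-byte kinds and that reals are IEEE-754. Strictly speaking, the standard says `i` becomes undefined when `r` is assigned, so the EQUIVALENCE route relies on what compilers do in practice.

```fortran
program bits_two_ways
  use iso_fortran_env, only: int32, real32
  implicit none
  real(real32)   :: r
  integer(int32) :: i
  equivalence (r, i)          ! r and i share the same 4 bytes

  r = 1.0_real32

  ! Route 1: read the overlaid integer directly.
  print '(a,z8.8)', 'via EQUIVALENCE: ', i
  ! Route 2: copy the bit pattern with the TRANSFER intrinsic.
  print '(a,z8.8)', 'via TRANSFER:    ', transfer(r, 0_int32)
  ! IBITS extracts a field: here the 8-bit biased exponent (bits 23..30).
  print '(a,i0)',   'exponent field:  ', ibits(transfer(r, 0_int32), 23, 8)

  if (i /= transfer(r, 0_int32)) error stop 'the two views disagree'
end program bits_two_ways
```

On an IEEE-754 machine both hexadecimal lines should read 3F800000 and the exponent field should be 127.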

I am finding that ASSOCIATE can be used to replace these local name aliases. Of course, if there are 20,000 of these in a code, one would need an automatic tool to do the conversion.
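As a sketch of what that replacement might look like for the pitch/attitude case mentioned earlier in the thread (the type name `attitude_t` and its components are made up for illustration):

```fortran
program associate_alias
  implicit none
  type :: attitude_t                      ! hypothetical derived type
     real :: pitch = 0.0, roll = 0.0, yaw = 0.0
  end type attitude_t
  type(attitude_t) :: attitude

  ! Short local names for components, valid only inside the construct.
  associate (pitch => attitude%pitch, roll => attitude%roll)
    pitch = 2.5
    roll  = pitch / 2.0
  end associate

  ! Assignments through the aliases are visible in the original object.
  print *, attitude%pitch, attitude%roll
  if (attitude%pitch /= 2.5 .or. attitude%roll /= 1.25) error stop 'alias did not write through'
end program associate_alias
```

Unlike EQUIVALENCE, the alias is scoped to the ASSOCIATE block, so every use of the short name has to sit inside it.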

I would like to know the answer to this too. In some sense, ASSOCIATE is tamer than EQUIVALENCE, or POINTER, so maybe the compiler can produce better code or be more aggressive with optimizations?

My memory is not what it once was, but wasn’t it mentioned in a recent thread here that some compilers appear to implement ASSOCIATE as if the associate names were pointers? My issue with ASSOCIATE is that it can get very unwieldy and cumbersome in a hurry if you have more than a handful of items you want to rename.

If you have many items to rename, equivalences will be cumbersome too.

This is where I’ve seen the most usage of equivalence as well. One of the codes I occasionally work on has three massive arrays (>10,000 elements) that live in a common block. To “import” a variable to a local scope, each element of the arrays is locally equivalenced to a local variable name. This is a maintenance nightmare because:

  • The array index is a magic number:
equivalence( xa(4522), local_var_name1) ! how do I keep all these indices straight?
equivalence( xa(967),  local_var_name2) 
equivalence( xa(1884), local_var_name3(1)) ! this is an array of dimension 100
! ... 
equivalence( xa(8875), local_var_name4)
  • There’s no way to guarantee the same array index refers to the same variable in different scopes:
! in subroutine aerodynamics.f
equivalence( xa(4523), airspeed)
! in subroutine propulsion.f
equivalence( xa(4532), airspeed)  ! oops, transposed digits in the index
  • It’s very susceptible to array collisions, e.g. where local_array has more than one element, this is valid Fortran but wrong semantics:
equivalence( xa(4522), local_array(1,1)) ! local_array has dimension 3x3
equivalence( xa(4523), local_scalar) ! oops, this overlays local_array(2,1)
  • It’s hard to make changes without breaking things. For example, if I need to add a new array with 200 elements (50 timepoints of a 4-element vector), how do I easily find an unused 200-element section of the global array?
double precision local_array(50,4)
equivalence( xa(9015), local_array(1,1)) ! xa(9015) through xa(9214) must not be used anywhere else
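The array-collision case can be reproduced in a few lines. This is only a sketch: the common block name `globals` is invented, and the indices are the ones from the example above.

```fortran
program equivalence_collision
  implicit none
  double precision :: xa(10000)
  common /globals/ xa                      ! hypothetical common block name
  double precision :: local_array(3,3), local_scalar
  equivalence (xa(4522), local_array(1,1)) ! occupies xa(4522) .. xa(4530)
  equivalence (xa(4523), local_scalar)     ! lands inside that range

  local_array  = 0.0d0
  local_scalar = 7.0d0                     ! silently overwrites local_array(2,1)

  print *, 'local_array(2,1) =', local_array(2,1)
  if (local_array(2,1) /= 7.0d0) error stop 'expected the overlap to show up'
end program equivalence_collision
```

The compiler accepts this without complaint because the two EQUIVALENCE statements are mutually consistent. Nothing in the language knows that the overlap was unintended.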

A little over a decade ago we used some clever Python scripts on another (similar) codebase to replace all common blocks with modules, and got the output to match to full double precision. There were no equivalence statements to deal with in that other codebase, but I imagine a similar update could be made where all equivalence statements are replaced by

use module_name, only: local_var_name

And then you have no more magic numbers or array collisions to worry about.
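A minimal sketch of the module-based version, assuming a hypothetical module `flight_state` that owns the variable under its real name:

```fortran
module flight_state                        ! hypothetical module name
  implicit none
  double precision :: airspeed = 0.0d0     ! one definition, no array index
end module flight_state

program use_module_state
  implicit none
  call aerodynamics()
  call propulsion()
contains
  subroutine aerodynamics()
    use flight_state, only: airspeed       ! import by name
    airspeed = 120.0d0
  end subroutine aerodynamics

  subroutine propulsion()
    use flight_state, only: airspeed       ! same name, same storage
    print *, 'propulsion sees airspeed =', airspeed
    if (airspeed /= 120.0d0) error stop 'scopes disagree'
  end subroutine propulsion
end program use_module_state
```

A misspelled name in an `only:` list is a compile-time error, whereas a transposed index in an EQUIVALENCE statement is not.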

Excellent analysis, thank you! Obvious missing functionality for me is:

  • equivalencing dummy arguments and local procedure arguments
  • equivalencing derived types (at least bind(C) or sequence types)

Plenty of tricks are available for dynamic casting, none for static casting like equivalence (in a very limited manner) allows.

EQUIVALENCE is not a static cast; its semantics are closer to those of the union facility in the C standard.

C interoperable types such as int32_t and int64_t, together with memcpy from the C standard library, are what Fortran practitioners may want to consider first for any actual use cases in non-legacy codes.
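One way this suggestion might look from the Fortran side is sketched below. The interface block for `memcpy` is hand-written here and assumes the program is linked against the C library, as is normally the case. The variable names are illustrative only.

```fortran
program memcpy_view
  use iso_c_binding, only: c_int32_t, c_int64_t, c_ptr, c_size_t, c_loc, c_sizeof
  implicit none
  interface
     ! void *memcpy(void *dest, const void *src, size_t n);
     function memcpy(dest, src, n) bind(C, name='memcpy') result(res)
       import :: c_ptr, c_size_t
       type(c_ptr), value :: dest, src
       integer(c_size_t), value :: n
       type(c_ptr) :: res
     end function memcpy
  end interface
  integer(c_int64_t), target :: word
  integer(c_int32_t), target :: halves(2)
  type(c_ptr) :: ignored

  word = 4294967298_c_int64_t              ! 2**32 + 2

  ! Copy the 8 bytes of word into the two 32-bit halves.
  ignored = memcpy(c_loc(halves), c_loc(word), c_sizeof(word))

  print *, halves                          ! element order depends on endianness
  ! The pure-Fortran TRANSFER intrinsic should give the same bytes.
  if (any(halves /= transfer(word, halves))) error stop 'memcpy and TRANSFER disagree'
end program memcpy_view
```

Unlike an EQUIVALENCE overlay, this makes a copy, so later changes to `word` are not reflected in `halves`.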