Simple elemental sequence association?

In updating virtually any large legacy package the use of scalars as
arrays and passing arrays with rank mismatches arises. Even widely
distributed packages (BLAS, LAPACK, ODEPACK, …) assume this is OK; it
was defacto a Fortran feature; but was not standard. It now complicates
modernizing a lot of packages, at least if you want to include all the
code in modules.

Sometimes there is a straight-forward option. If passing a scalar as
an INTENT(IN) array, place the value in brackets and create a temporary
array. If there are not a bunch of permutations of the arguments required,
create a generic interface, or create an ISO_C_BINDING interface to
memcopy(3c) or otherwise use C interface features. If just a shape
problem pass the array sizes and use them to dimension the arrays,
circumventing array bounds checks, … but without going on too much
more it can get complicated and is error prone to correct.

So assuming the standard will not change this soon, and given the
limitations of assumed-rank and assumed-shape arrays, is there a simpler
way to port code like this so it can be placed in standard-conforming
modules with less effort?

There are instances where things are much simpler if one can change the
rank of an array (or scalar) when it is passed to a procedure, as was
so often done in older code. If you can determine it is INTENT(IN) and
the overhead of a (probable) temporary array is acceptable RESHAPE(3f)
and PACK(3f) can be used, along with scratch variables but that is often
not the case. Elemental procedures, generic procedures, TRANSFER, SELECT
RANK, assumed size, array element sequence association, scratch arrays
and functions to reshape variables, … none of them supplies a simple
generic solution that I have thought of.

I keep leaving the more complex routines in older code outside of modules
for very extended times (maybe forever?) because of the complexity
required just to get sequence association in a module and because that is
so common in old code with dozens of arguments on their procedures.

As an exercise, I made a subroutine that would copy elements from IN
to OUT until the end of one of them is hit (the equivalent of a simple
memcopy(3c)) for scalars and/or arrays up to nine dimensions and it
took me almost 100 lines if I required not using C interfaces and that
it must be able to be placed in a module.

Hopefully, I am missing something simple?

1 Like

In legacy code, f77 and earlier, you could pass an array element actual argument to a scalar dummy argument or to an array dummy argument. But it was never allowed to pass a scalar actual argument to an array dummy argument, not even to an array of size one. Many fortran compilers would warn you if you did this, along with utilities like FNTCHK, but there were many cases where it was not possible for the compiler to detect this kind of error. Now in modern fortran, routines that have explicit interfaces help the compiler to detect more of these errors. I don’t think there are any work arounds for these types of errors, you just have to fix the code. This is different from rank mismatches for arrays. Legacy code relied on storage sequence association because there was no ALLOCATE statement or automatic array facility that would provide arrays of the correct size or temporary work arrays of the right size. Again, this problem is now solved with modern fortran.

What is simple is in the minds of the practitioner. An option is C interoperability features as alluded to above, readers can decide whether that is simple based on the illustration below. Given the relatively good availability of a companion C processor to go with Fortran processors, it is reasonably usable on most platforms.

Note here though certain refactoring is inevitable if modern Fortran i.e., mainly explicit interfaces are desirable for such legacy packages.

Say there is a legacy package with a subroutine with an assumed-size dummy argument of rank-1 like so:

      SUBROUTINE SUB(X, N)
         INTEGER X(*)
         INTEGER N 
         INTEGER I
         DO I = 1, N
            X(I) = 42
         END DO
      END SUBROUTINE  

which is consumer by the following caller without an explicit interface of course and the non-standard practice of scalar actual argument associated with the rank-1 dummy argument, a common enough scenario:

      INTEGER A
      CALL SUB(A, 1)
      PRINT *, "A = ", A
      END

Almost all the processors will help create programs that appear alright to all the legacy users:

C:\Temp>gfortran -c -std=legacy f.for

C:\Temp>gfortran -c -std=legacy p.for

C:\Temp>gfortran p.o f.o -o p.exe

C:\Temp>p.exe
 A =           42

Now slap an explicit interface to the subprogram without much thought and a modern processor will complain:

    3 |    CALL SUB(A, 1)
      |            1
Error: Rank mismatch in argument 'x' at (1) (rank-1 and scalar)

But instead, say an author is willing to refactor a little and is seeking certain benefits starting with explicit interfaces, then a workaround under the above circumstances can be a generic interface to go with such an explicit interface. Here an additional specific procedure is introduced to support the rank-0 i.e., scalar dummy argument use case by the library’s consumers. Perhaps it may be along the following lines using the interoperability with C feature:

module m
   interface sub !<-- generic interface
      module procedure sub
      module procedure sub_s  !<-- introduce a specific procedure to support rank-0 callers
   end interface 
contains
   subroutine sub(x, n)  !<-- original "library" procedure, minimally refactored for free-form source
      integer x(*)
      integer n 
      integer i
      do i = 1, n
         x(i) = 42
      end do
   end subroutine 
   subroutine sub_s(x, n)
      use, intrinsic :: iso_c_binding, only : c_loc, c_f_pointer
      integer, target :: x
      integer n
      integer, pointer :: px(:) 
      ! elided is the assertion for n==1      
      call c_f_pointer( cptr=c_loc(x), fptr=px, shape=[ n ] )
      call sub( px(1:n), n )
   end subroutine 
end module

The caller too then refactors i.e., introduces USE association:

      USE M, ONLY : SUB
      INTEGER A
      CALL SUB(A, 1)
      PRINT *, "A = ", A
      END

and some progress is attained:

C:\Temp>gfortran -c m.f90

C:\Temp>gfortran -c p.for

C:\Temp>gfortran p.o m.o -o p.exe

C:\Temp>p.exe
 A =           42

But of course the above workaround can get unwieldy with multiple array dummy arguments. No pain, no gain.

I have had many tries to envigorate old code, but minimise the error/warning messages modern compilers provide.

My approach to updating legacy code is to replace the legacy way of cutting up a common work(9999), so to progressively replace these “bits” with allocatable arrays.
This replaces the calculation of the offsets:
i1=.1…; i2= i1+ n*m ; i3 = i2 …
with array allocations:
allocate ( a1(n1,m1) ) ; allocate ( a2(n2,m2) ) ; …
I think the use of ALLOCATE is a better documention than the calcilations of the i’s.

While you do these updates, array WORK can just remain at 1980’s size and not affect

There remain a few areas where the ALLOCATE approach does not work.

  1. Array size 1 : As FortranFan Identifies, where the array size is 1, a constant was sometimes used as an argument, which is now an identified error, such as for write a list of values, when the list size is 1. My solution has been provide two routines for a scalar or array.

  2. Arrays are contiguous : In pre F90 times, arrays were always contiguous, but efficient use of array sections has meant this approach has changed, such as in “matmul ( array(1,:,: ) , array(3,:,: ) )” This breaks the basic legacy approach of assuming any array is a continous block of memory.

  3. Array size of unknown : Often when the array size was unknown, you would use
    call get_info ( work(iend), n_found )
    where “iend” pointed to the end of the used part of “work” and n_found was returned, to be used to update the slicing of “work” for the next data input phase to continue.
    The justification for this approach was that if this data overflowed “work” the rest of the analysis would fail ; else an available size limit could also be provided.

ALLOCATE has not addressed this issue.

Sometimes there were two lists of data to be recovered, integer and real values, so the same array layout could be used with appropriate subscript sizes to make this possible; eg a set of items with 6 integer properties and 4 real properties (56 bytes of info per point) could be achieved by:

call get_info ( work(iend), work(iend), n_found )

subroutine get_info ( int_list, real_list, n_found )
integer(4) int_list(14,* )
real(8) real_list(-2:4,* )

This approach works successfully for n_found over a large range, say 10 to 100 million points, defering the decision of how many points might be in a data set until the info is available.
This approach can generate many errors and warnings and I am sure the Fortran purists would disagree, but when a simple solution is required, convenient approaches based on proven experience, are required.

So my recommendation for modifying legacy code is replace what you can with ALLOCATE and see what is left or what error messages are left.

It is unfortunate that these legacy approaches are now so discredited, as they were the basis of many world leading computation approaches of the 60’s and 70’s that I have learnt so much from.

Thanks for all the replies. I think I will take some time to finish up some of the outliers in a few packages I was standardizing.