Passing real8 array variable as 2D integer4 array

Hi everyone,

I am seeing certain practice that is confusing for me in some Fortran77 code. This code is about the so-called interpolative decompositions of matrices (you might also know them as CUR decompositions in other circles).

As a part of rewrite in SciPy, rather unwillingly I must confess, I’m trying to be faithful to the original codebase from Tygert et al. The source code is still available in author’s website Software (third entry)

We have been already warned by @certik in the old PR where this code has been added to SciPy; see Interpolative decomposition by inducer · Pull Request #2616 · scipy/scipy · GitHub

But I would like to understand the consequences or motivation, if you can tell, by the following code. I’ll remove the docstring for brevity here but the code is taken from here scipy/scipy/linalg/src/id_dist/src/id_rtrans.f at 86c3a5bf8d7bc965a7873b744127ee076a8c3fa1 · scipy/scipy · GitHub ;

        subroutine idd_random_transf_init(nsteps,n,w,keep)
        implicit real *8 (a-h,o-z)
        save
        dimension w(*)

        ninire=2
        ialbetas=10
        lalbetas=2*n*nsteps+10 
        iixs=ialbetas+lalbetas
        lixs=n*nsteps/ninire+10
        iww=iixs+lixs
        lww=2*n+n/4+20
        keep=iww+lww

        w(1)=ialbetas+0.1
        w(2)=iixs+0.1
        w(3)=nsteps+0.1
        w(4)=iww+0.1        
        w(5)=n+0.1

        call idd_random_transf_init0(nsteps,n,w(ialbetas),w(iixs))

        return
        end

But if we look at the definition of the subroutine called it is an implicitly typed 2D integer array. Does this mean every real *8 entry is considered to be two integer *4s?

        subroutine idd_random_transf_init0(nsteps,n,albetas,ixs)
        implicit real *8 (a-h,o-z)
        save
        dimension albetas(2,n,*),ixs(n,*)
c
c       routine idd_random_transf_init serves as a memory wrapper
c       for the present routine; please see routine
c       idd_random_transf_init for documentation.
c
        do 2000 ijk=1,nsteps
c
        call idd_random_transf_init00(n,albetas(1,1,ijk),ixs(1,ijk) )
 2000 continue
        return
        end

fortls already catches this with “Type mismatch in argument ‘ixs’ at (1); passed REAL(8) to INTEGER(4)” (Nice!)

However, what could be the motivation to do so ? I don’t think there is bit arithmetic playing here but not sure. Is this going to truncate the floating result to integer as in type casting (int)num in C-lang? I was hoping that the compilers would error out in type-checking phase but apparently not the common practice.

Appreciate the help.

1 Like

I think that the array w is simply used as storage area. Apparently, the original authors were not bothered using a separate integer work array.

Since FORTRAN 77 of old did not have any means to check the type of actual arguments versus dummy arguments, this would simply work. On the caller’s side the contents of w (as double precision numbers) would be nonsense. You do not have to worry about some type conversion - there is none in this way.

Note that real*8 and integer*4 are non-standard declarations. The standard says that a real or integer variable shall occupy one numeric storage unit (without mentioning any size), while a double precision variable shall occupy exactly 2 numeric storage unit.

In practice, with virtually all existing compilers, the numeric storage unit is 4 bytes.So, a real*8 element can be storage associated to 2 integer elements. But there is absolutely no casting of the content.

Passing arguments with type mismatchs has always been illegal in Fortran, but it was a relatively common practice because 1) it was convenient (for instance to store indifferently integer or real variable in a work array, 2) it was guaranteed to work, and 3) there was no compile-time checks for that.

Thank you both. Another thing I am seeing is indexing an array variable with seemingly floating point variables. As an example, consider this line above

       w(1)=ialbetas+0.1

This is later on used as

        subroutine idd_random_transf(x,y,w)
        implicit real *8 (a-h,o-z)
        save
        dimension x(*),y(*),w(*)
c
c       applies rapidly a random orthogonal matrix
c       to the user-specified real vector x, 
c       using the data in array w stored there by a preceding 
c       call to routine idd_random_transf_init.
c
c       input:
c       x -- the vector of length n to which the random matrix is
c            to be applied
c       w -- array containing all initialization data
c
c       output:
c       y -- the result of applying the random matrix to x
c
c
c        . . . allocate memory
c
        ialbetas=w(1)
        iixs=w(2)
        nsteps=w(3)
        iww=w(4)
        n=w(5)
c
        call idd_random_transf0(nsteps,x,y,n,w(iww),
     1      w(ialbetas),w(iixs))
c
        return
        end

ialbetas here is strange since according to the comments it should be an integer. But I can’t see the significance of 0.1. is there any similar shortcuts that people used to use?

It’s hard to tell what was the intention of the author here… My guess:

  • for some reason the author temporarily stores some indexes (which are integers by nature) in a floating point array
integer i
real*8 a
a = i   ! implicit casting
call somesub(a)
  • at some point he has to retrieve the values as integers
subroutine somesub(b)
real*8 b
integer j
j = b   ! implicit casting, equivalent to j = int(b)
  • Depending on how the processor handles the rounding, there is a risk at the end to have j == i-1. Adding 0.1 is a way to avoid such a case.

Just a guess… In practice the risk is close to zero, as an IEEE 64 bit floating points variable can store the exact representations of all the integers up to 2^{53}: I doubt the code has indexes that are larger than this value…

3 Likes

As @PierU has hypothesized, it could be due to the truncation when casting from real to integer. This is an issue also in LAPACK, where the work-size query returns a real value: Loss of precision in all single precision workspace queries, implicit and explicit · Issue #600 · Reference-LAPACK/lapack · GitHub (see the comment by @mgates3 on Jul 21, 2021)

Ah OK. That’s a bit too much for my acceptable magic level considering there is also a lot of SAVE involved.

So probably I’ll be better off recoding the algorithm and not the code flow. Thank you very much for the explanation. Off to the abyss then.

Yes we also fixed it on SciPy side. See the last entry in that thread as a link to the PR.