SAVE attribute scope

Greetings,

I am trying to understand the SAVE attribute behavior a little bit better in the classical code bases. Basically, my confusion revolves around the scope of reverse communication.

To start with a simple and small example, we have a code that has 3 functions (apologies for not making a full program as I still don’t know what the best practice is). Here is something that I will fake by parroting what I am seeing.

       subroutine entryfunc(a, b, c)
         integer stage
         save stage 
         stage = 0

         if (stage .eq. 0) then
!          inner1 sets stage to = 1
           call inner1(a, b, c, stage)
!          exit for reverse communication to do something with a,b,c
           go to 100
         else
!          Second call to entryfunc should come here.
!          because we set stage = 1
           call inner2(a, b, c, stage)
!          exit again for rev. com.
           go to 100
         endif

    100  continue
         return
       end subroutine

This is basically the minimalized problem I have with a larger code base.

That is confusing me about ARPACK. This function repeatedly exits back to Python to compute matvecs and other necessary information. But saved variables should have lost their scope, in my ignorant opinion. However since this keeps on entering to the same program, something should be setting the state of all functions from scratch. Otherwise reverse communication would be pointless.

On the other hand, these classical software typically called by programs that are often named driver.f or similar. Hence I suppose they hold the entire scope intact while repeatedly calling the bespoke entryfunc. Because then entryfunc and everything connected to it become a nested function call (I think!)

My curiosity is about when we interface from Python through f2py do we also acquire the entire scope just like a driver program such that everything else retains its value.

These lines are examples how we call the problem inside a while loop when we solve ARPACK problems

The answer will determine whether I have to carry all saved variables of all subfunctions during reverse communication and then reset them in the next call or not since I rewrote all in C and I don’t have any driver function requirement but I just want to make sure I am not getting caught in another historical gotcha.


Concrete example

dgetv0 is called by various functions but you can follow the most obvious chain. dnaupd -> dnaup2 -> dgetv0 generates a random vector to start with then exits three levels up.

Now I know we have issues with the lack of control of randomization and we reported it years ago that did not lead to any fruition e.g., Arpack random generator cannot be seeded · Issue #218 · opencollab/arpack-ng · GitHub So that’s a separate concern but I just wanted to use an example. Because this is how the same seed is used across a full run. Otherwise it would generate the same random vectors all the time. So this gives me the impression that Python call is also a driver with its while loop.

This is one of the issues I fixed by yanking out the randomization parts to a separate level independent from the original ARPACK code so we can properly seed with modern NumPy Random C-API. But I just need to make sure that I understood the fortran parts correctly.

As always, thank you for your help.

I’d say save is the opposite of “going out of scope”. It’s telling the compiler to preserve the value “alive” between function calls. It’s essentially part of the global state, but only visible internally within the procedure. Here’s the general concept: Static variable - Wikipedia

If it helps, you can imagine save as a static variable within a function in C:

// count.c
#include <stdio.h>

void count(int *value) 
{
   static int counter_ = 0;
   counter_++;
   if (value) *value = counter_;
}

#define nil (void *) 0

int main() {

   count(nil);
   count(nil);
   count(nil);

   int value;
   count(&value);
   printf("counter = %d\n", value);
   
   return 0;
}
}
! count.f90
program main
implicit none
integer :: value

call count()
call count()
call count()

call count(value)
print *, "count = ", value

contains

   subroutine count(value)
      integer, intent(out), optional :: value
      integer, save :: counter_ = 0
      counter_ = counter_ + 1
      if (present(value)) value = counter_
   end subroutine

end program

If you don’t want save anymore, e.g. to make the library thread-safe, or other reasons, the right approach would be to encapsulate the save’d variables into a derived type/struct, which is passed between calls.

      type :: dgetv0_state
         logical :: first, inits, orth
         integer :: idist, iseed(4), iter, msglvl, jj
         double precision :: rnorm0
      end type

      subroutine dgetv0
     &   ( ido, bmat, itry, initv, n, j, v, ldv, resid, rnorm,
     &     ipntr, workd, ierr, state )
C                              ^-- extra dummy argument
C ...
      type(dgetv0_state), intent(inout) :: state

The caller would be expected to keep the values in state intact without modification.

1 Like

I can do that but what I am asking about is when we don’t call this from a program. Or think of a static C library without main. Then you don’t have the surrounding scope hence my question.

Similarly ARPACK fortran code also does not have a main but only called via f2py. So you have to strip off the program and think of calling count from outside.

I don’t see how that changes anything.

External Fortran procedure (no enclosing scope):

! count.f90
subroutine count(value)
   integer, intent(out), optional :: value
   integer, save :: counter_ = 0
   counter_ = counter_ + 1
   if (present(value)) value = counter_
end subroutine

Driver in C:

// main.c
#include <stdio.h>

// Fortran procedure
#define count count_
extern void count(int *);

#define nil (void *) 0

int main(void) {

	count(nil);
	count(nil);
	count(nil);

   	int value;
   	count(&value);
   	printf("counter = %d\n", value);
   
   	return 0;
}

Test run:

$ gfortran-13 -c count.f90 
$ gcc-13 main.c count.o 
$ ./a.out
counter = 4

The difference is what I am asking. I know and I did coded exactly like that. So C side is not a problem.

Currently we are calling things from Python through f2py. So concrete question is whether every call is still inside the same scope because f2py is acting like a bridge and nannies a main around the wrapped code or it ends the program everytime it bridges and creates a new scope next call.

Because essentially f2py is a C code around the fortran code with a single invocation.

I am trying to confirm my suspicion not trying to reinvent a mechanism in C.

But come to think of it, Python interpreter becomes the main in this case. So to reset things, we have to force-reimport the library. I guess that is what it is doing.

I only used f2py once, but my naive understanding is it creates a Python extension module which is effectively a shared library (.dll, dylib, .so). If I’m not mistaken, the static variables are created when the shared library is loaded into the address space of the process.

Yes, that would match my understanding. If there is save’d state, e.g. from the random number generator, the only way to reset that variable would be to reimport the library. Of course you could also extend the procedure by adding a new ido/stage case to the reverse communication procedure which does the resetting for you.

This code is probably not doing what you want it to do. If you want the values of stage to be retained and reused on subsequent calls, then you do NOT want the stage=0 assignment statement to be executed on every call. Instead, do something like

   integer, save :: stage = 0

This tells the compiler to 1) save the value for subsequent calls, and 2) initialize (in the fortran sense) the value to 0. That value of zero will be available to the subroutine in the first call, and it can then be reset as appropriate for subsequent calls.

As the code is written, the save attribute doesn’t really accomplish anything because stage has only local scope (no other routine can see it) and because it is reset anew every time the subroutine is called.

I should perhaps mention that the save attribute is redundant in the above code, but most programmers (including me) prefer to include it explicitly nonetheless just to avoid any confusion about the programmer’s intent.

1 Like

Yes quite possible. I just made it up on the fly without paying too much attention. Thank you both.

To me at least, static variables made much more sense after I wrote my little CHIP-8 emulator.

Like everything in a Von Neumann architecture, a static variable is just a chunk of memory. Nowadays it is more complicated, because of the mapping into the address space, loading at run-time, etc.

Counter with static variable in CHIP-8 assembly
; CHIP-8 uses 16 registers, named V0 - VF (hex)
; The value I is a 16-bit address pointer

main:
    LD VA, 0        ; VA = 0
    CALL count
    CALL count
    CALL count
    LD VA, 1        ; VA = 1, retrieve count,
    CALL count      ; result is found in VB 
    LD V3, 2
    LD V4, 2
    LD V5, VB       
    call write      ; Write count to screen
.fin:
    JP .fin

; Count number of function calls using a 
; static variable
;
; Arguments:
;   VA ... if 1, write output value
;   VB ... output value
;
count:
    JP .body        ; Jump to executable part 
.counter:           ; Counter variable, reserve
    #d8 0x0         ;    1 byte of memory              <-- static variable
.body:
    LD I, .counter  ; Load address of counter
    LD V0, [I]      ; Load counter into register
    ADD V0, 1       ; Increment
    LD I, .counter  ; Re-load address
    LD [I], V0      ; Store counter
    SE VA, 1        ; Skip if VA == 1
    JP .end
    LD VB, V0       ; VB = V0
.end:
    RET

write:
   ; ... omitted ...

It’s kind of wonky because the instruction set is very primitive, I’m using a ad-hoc calling convention, and instructions and data share the same address space without any form of protection.

After compilation it becomes a binary blob (shown here as a hex dump); the static (or saved in Fortran parlance) variable is the one in column 9, row 2, that is 00, just below 6a.

                                vv
 00 | 6a 00 22 16  22 16 22 16  6a 01 22 16  63 02 64 02 | j.".".".j.".c.d. |
 10 | 85 b0 22 2b  12 14 12 19  00 a2 18 f0  65 70 01 a2 | .."+........ep.. | <--
 20 | 18 f0 55 3a  01 12 29 8b  00 00 ee a2  6a d3 45 73 | ..U:..).....j.Es |
 30 | 06 a2 6f d3  45 73 06 a2  74 d3 45 73  06 a2 79 d3 | ..o.Es..t.Es..y. |
 40 | 45 73 06 a2  7e d3 45 73  06 a2 83 d3  45 73 06 a2 | Es..~.Es....Es.. |
 50 | 67 f5 33 f2  65 f0 29 d3  45 73 05 f1  29 d3 45 73 | g.3.e.).Es..).Es |
 60 | 05 f2 29 d3  45 00 ee 00  00 00 f8 80  80 80 f8 f8 | ..).E........... |
 70 | 88 88 88 f8  88 88 88 88  f8 88 c8 a8  98 88 f8 20 | ...............  |
 80 | 20 20 20 00  f8 00 f8 00  .. .. .. ..  .. .. .. .. |    ............. |

After running in the emulator, it works like intended:

image

The blog posts from Raymond Chen from Microsoft are also a gold mine of knowledge on memory organization and argument passing conventions:

As Raymond nicely illustrates in the second link, in the past everything including local variables would have static storage duration. There are still ways to force this behavior, e.g. using -fno-automatic or certain DEC extensions.

Edit: after sleeping over this, I began to ponder if any compilers store static variables within the instructions themselves making use of self-modifying code:

; Increment counter (uses self-modifying code) 
; Set VA = 1 beforehand to retrieve it. Output in VB.
; The counter is stored as an immediate value in the 
; instructions that loads it
count:
    LD I, .load ; Load address
    LD V1, [I]  ; Load V0 and V1 with contents at I and I+1
    ADD V1, 1   ; Add 1 to counter
    LD I, .load ; Re-load address
    LD [I], V1  ; Store V0 and V1 at I and I+1
    SNE VA, 1   ; Skip if VA != 1
.load:
    LD VB, 0x0  ; Retrieve counter (stored in this immediate value)
    RET

@ilayn judging from your frequent posts here, by the time you are done succeeding at your goal of removing all of Fortran from SciPy, you will end up quite a Fortran expert. :slight_smile: If you end up being sad and want to contribute to some Fortran projects (no longer in SciPy), you will always be welcome to send us PRs to the many Fortran projects here. :slight_smile:

5 Likes

Ha! I already started :wink:

1 Like

In this translation, we cannot use static just as we cannot use SAVE in Fortran because there is no single user call guarantee. Users (already) want to be able to call this code from many threads. Hence static vars will have races. So everything is going to be stored in a struct that I am passing around by reference to keep the state. There are about 50ish variables in total. Still very very ugly but necessary evil unless we want to rewrite the whole thing from scratch. Unfortunately I don’t have such capacity currently. But maybe in the future.

One thing about SAVEd variables I did not see in this conversation is this:

integer :: counter = 0

versus the C fragment:

int counter = 0;

either one appearing in a function/routine.

There is an essential difference here between Fortran and C:

  • In C the variable would be set to zero on every entry to the function/routine.
  • In Fortran the variable automatically (!) gets the SAVE attribute and has the value zero right from the start of the program/DLL/SO, without the containing function/routine having been called. The variable is NOT set to zero when you enter the function/routine. (At least that is my understanding :slight_smile:)
  • The Fortran fragment is equivalent to:
static int counter = 0;
  • It is probably best to explicitly use the SAVE attribute on such variables, if only that the SAVE attribute then documents that fact.

(I hope this is not causing confusion.)

1 Like

That sounds like a good plan. Looking forward to the result. :+1:

In case of multi-threading, my understanding is that often you’d like the matvec operation to be done in parallel. SciPy has lacked this so far, but I see there is work on it now: ENH: Parallelize CSR (sparse matrix) methods by alugowski · Pull Request #19717 · scipy/scipy · GitHub.

The second-case is when you have multiple eigenvalue problems (that is matrices), and you’d like find the eigenvalues of each of them in parallel. To make this possible, thread-safe ARPACK procedures are needed, which is currently not the case. I’m guessing one can still use multi-processing here.

I’m wondering how would the two different modes of parallelism interact?

Btw, once you have the C library in-place, you can try refactor it using semantic patches: Coccinelle: A Program Matching and Transformation Tool for Systems Code. This is sort of like a “search and replace” on steroids. In the reverse directions, Fortranners can use it to generate interfaces from the C header.

1 Like

Ah you are too kind but two months late, since it caused the confusion and costed me a week. Here is my confusion in text format :sweat_smile:

My guess is that it will be due to mostly mutual exclusivity. Because after reverse communication the control is completely passed to Python side and hence matvec. I don’t know what will happen if matvec code also calls ARPACK code but at some point my nano brain cannot handle that much complication.

The random C-API of NumPy has built in properties for parallelism so if I invoke correctly it should go through. The rest of the code is now stateless so on paper it should be safe.

It does fill me with sadness to see these old Fortran libraries being converted to C. C is just not the numerical programming language of the future. We have shown how to modernize the old Fortran 77 codes to bring them up to date with modern concepts (object oriented, thread safe, etc.) For SciPy, all that was needed was someone over the past 20 years to realize that this could be done, but I guess it never was. A huge missed opportunity for the Fortran community. Fortran usage in SciPy could have been a great driver for improving modern Fortran tooling (beyond the f2py hack). I guess it is too late now? I feel like we are improving these things slowing but surely, but without SciPy we lose a huge potential pool of users and developers.

2 Likes

100% agree. But if you want to wait for that silver bullet to arrive, C is the seemingly only candidate to archive things. There are at least 4 open-source competitive languages that can consume native C code. In fact I am using Zig to compile my C code personally. Bear with me to make a point below.

Again fully agree. I think your codebase should replace the old versions in all tools including R and Octave and any other libraries wrapping these old things.

True but this is a very important sentence to ponder because that’s exactly what we did for a few years, at least I did, before I decided to kill my evenings and weekends with this silly crusade of mine about the code within SciPy. Why did not folks rise up to the task of touching these codebase while every corner of the world is booming with open-source contributors, from web frameworks to ML. This is a fundamental point to think about by the powers of Fortran that be and maybe the LFortran team, obviously not in this issue but also maybe an internal discussion. Note that I am not using any features or shortcomings of Fortran old or new. Just the concept itself; attracting changes or modernization.

I don’t consider languages as separate islands. That’s somehow a fortranism, that is users of fortran become somehow very emotional about this tool. I will never get it but fine it is what it is.

It is not late at all, because the final level boss is just sitting there. And I do believe that the scientific community should do something about it as soon as possible; that is LAPACK.

I have been talking to many folks, compiler designers and others, to find something to convert LAPACK into some array language that is not F77. You folks and I clearly have different ideas about modern fortran. That’s perfectly fine. This should not hinder any individual efforts, you do the new fortran port I’ll do something else. Every bit of that effort is valuable.

I am planning to give a talk sometime somewhere about this and create a very mild call for action. You can see my unpolished quickfire summary in Rewrite LAPACK so it's not in F77 · Issue #7 · scientific-python/faster-scientific-python-ideas · GitHub I hope folks will give me the benefit of doubt before start defending Fortran. This is independent from the language but to save all that nuggets of gold and diamond to the next level with LAPACK which is our collective linear algebra knowledge.

A year ago, I have been receiving, “that’s impossible, one person can’t read all that code” comments left and right almost every week, now we are talking about “maybe we should do something about LAPACK” and I think that is progress. Lfortran is clearly another wonderful thing. These should not be mutually exclusive. The ultimate goal is to remove the fixed format spaghetti code from scientific domain.

The more the merrier whether through modernized fortran or some other; ideally both simultaneously.

But please do something about this multiline comment topic. I’m tired of regexing 150000 lines of code :stuck_out_tongue_winking_eye:

1 Like