SAVE attribute scope

ilayn · August 22, 2024, 8:10pm

Greetings,

I am trying to understand the SAVE attribute behavior a little bit better in the classical code bases. Basically, my confusion revolves around the scope of reverse communication.

To start with a simple and small example, we have a code that has 3 functions (apologies for not making a full program as I still don’t know what the best practice is). Here is something that I will fake by parroting what I am seeing.

       subroutine entryfunc(a, b, c)
         integer stage
         save stage 
         stage = 0

         if (stage .eq. 0) then
!          inner1 sets stage to = 1
           call inner1(a, b, c, stage)
!          exit for reverse communication to do something with a,b,c
           go to 100
         else
!          Second call to entryfunc should come here.
!          because we set stage = 1
           call inner2(a, b, c, stage)
!          exit again for rev. com.
           go to 100
         endif

    100  continue
         return
       end subroutine

This is basically the minimalized problem I have with a larger code base.

That is confusing me about ARPACK. This function repeatedly exits back to Python to compute matvecs and other necessary information. But saved variables should have lost their scope, in my ignorant opinion. However since this keeps on entering to the same program, something should be setting the state of all functions from scratch. Otherwise reverse communication would be pointless.

On the other hand, these classical software typically called by programs that are often named driver.f or similar. Hence I suppose they hold the entire scope intact while repeatedly calling the bespoke entryfunc. Because then entryfunc and everything connected to it become a nested function call (I think!)

My curiosity is about when we interface from Python through f2py do we also acquire the entire scope just like a driver program such that everything else retains its value.

These lines are examples how we call the problem inside a while loop when we solve ARPACK problems

github.com

scipy/scipy/blob/c23157002050d468fc71c890cd7e8e792f0bd2e7/scipy/sparse/linalg/_eigen/arpack/arpack.py#L536-L540


      
          def iterate(self):
              self.ido, self.tol, self.resid, self.v, self.iparam, self.ipntr, self.info = \
                  self._arpack_solver(self.ido, self.bmat, self.which, self.k,
                                      self.tol, self.resid, self.v, self.iparam,
                                      self.ipntr, self.workd, self.workl, self.info)

The answer will determine whether I have to carry all saved variables of all subfunctions during reverse communication and then reset them in the next call or not since I rewrote all in C and I don’t have any driver function requirement but I just want to make sure I am not getting caught in another historical gotcha.

Concrete example

github.com

opencollab/arpack-ng/blob/0b3038d139f874a423ba6a1cdf5098fc15336848/SRC/dgetv0.f#L164


      
                parameter (one = 1.0D+0, zero = 0.0D+0)
          c
          c     %------------------------%
          c     | Local Scalars & Arrays |
          c     %------------------------%
          c
                logical    first, inits, orth
                integer    idist, iseed(4), iter, msglvl, jj
                Double precision
               &           rnorm0
                save       first, iseed, inits, iter, msglvl, orth, rnorm0
          c
          c     %----------------------%
          c     | External Subroutines |
          c     %----------------------%
          c
                external   dlarnv, dvout, dcopy, dgemv, arscnd
          c
          c     %--------------------%
          c     | External Functions |
          c     %--------------------%

dgetv0 is called by various functions but you can follow the most obvious chain. dnaupd -> dnaup2 -> dgetv0 generates a random vector to start with then exits three levels up.

Now I know we have issues with the lack of control of randomization and we reported it years ago that did not lead to any fruition e.g., Arpack random generator cannot be seeded · Issue #218 · opencollab/arpack-ng · GitHub So that’s a separate concern but I just wanted to use an example. Because this is how the same seed is used across a full run. Otherwise it would generate the same random vectors all the time. So this gives me the impression that Python call is also a driver with its while loop.

This is one of the issues I fixed by yanking out the randomization parts to a separate level independent from the original ARPACK code so we can properly seed with modern NumPy Random C-API. But I just need to make sure that I understood the fortran parts correctly.

As always, thank you for your help.

ivanpribec · August 22, 2024, 9:02pm

I’d say save is the opposite of “going out of scope”. It’s telling the compiler to preserve the value “alive” between function calls. It’s essentially part of the global state, but only visible internally within the procedure. Here’s the general concept: Static variable - Wikipedia

If it helps, you can imagine save as a static variable within a function in C:

// count.c
#include <stdio.h>

void count(int *value) 
{
   static int counter_ = 0;
   counter_++;
   if (value) *value = counter_;
}

#define nil (void *) 0

int main() {

   count(nil);
   count(nil);
   count(nil);

   int value;
   count(&value);
   printf("counter = %d\n", value);
   
   return 0;
}
}

! count.f90
program main
implicit none
integer :: value

call count()
call count()
call count()

call count(value)
print *, "count = ", value

contains

   subroutine count(value)
      integer, intent(out), optional :: value
      integer, save :: counter_ = 0
      counter_ = counter_ + 1
      if (present(value)) value = counter_
   end subroutine

end program

If you don’t want save anymore, e.g. to make the library thread-safe, or other reasons, the right approach would be to encapsulate the save’d variables into a derived type/struct, which is passed between calls.

      type :: dgetv0_state
         logical :: first, inits, orth
         integer :: idist, iseed(4), iter, msglvl, jj
         double precision :: rnorm0
      end type

      subroutine dgetv0
     &   ( ido, bmat, itry, initv, n, j, v, ldv, resid, rnorm,
     &     ipntr, workd, ierr, state )
C                              ^-- extra dummy argument
C ...
      type(dgetv0_state), intent(inout) :: state

The caller would be expected to keep the values in state intact without modification.

ilayn · August 22, 2024, 9:07pm

I can do that but what I am asking about is when we don’t call this from a program. Or think of a static C library without main. Then you don’t have the surrounding scope hence my question.

Similarly ARPACK fortran code also does not have a main but only called via f2py. So you have to strip off the program and think of calling count from outside.

ivanpribec · August 22, 2024, 9:21pm

I don’t see how that changes anything.

External Fortran procedure (no enclosing scope):

! count.f90
subroutine count(value)
   integer, intent(out), optional :: value
   integer, save :: counter_ = 0
   counter_ = counter_ + 1
   if (present(value)) value = counter_
end subroutine

Driver in C:

// main.c
#include <stdio.h>

// Fortran procedure
#define count count_
extern void count(int *);

#define nil (void *) 0

int main(void) {

	count(nil);
	count(nil);
	count(nil);

   	int value;
   	count(&value);
   	printf("counter = %d\n", value);
   
   	return 0;
}

Test run:

$ gfortran-13 -c count.f90 
$ gcc-13 main.c count.o 
$ ./a.out
counter = 4

ilayn · August 22, 2024, 9:28pm

The difference is what I am asking. I know and I did coded exactly like that. So C side is not a problem.

Currently we are calling things from Python through f2py. So concrete question is whether every call is still inside the same scope because f2py is acting like a bridge and nannies a main around the wrapped code or it ends the program everytime it bridges and creates a new scope next call.

Because essentially f2py is a C code around the fortran code with a single invocation.

I am trying to confirm my suspicion not trying to reinvent a mechanism in C.

ilayn · August 22, 2024, 9:34pm

But come to think of it, Python interpreter becomes the main in this case. So to reset things, we have to force-reimport the library. I guess that is what it is doing.

ivanpribec · August 22, 2024, 9:36pm

I only used f2py once, but my naive understanding is it creates a Python extension module which is effectively a shared library (.dll, dylib, .so). If I’m not mistaken, the static variables are created when the shared library is loaded into the address space of the process.

ivanpribec · August 22, 2024, 9:38pm

Yes, that would match my understanding. If there is save’d state, e.g. from the random number generator, the only way to reset that variable would be to reimport the library. Of course you could also extend the procedure by adding a new ido/stage case to the reverse communication procedure which does the resetting for you.

RonShepard · August 22, 2024, 10:03pm

This code is probably not doing what you want it to do. If you want the values of stage to be retained and reused on subsequent calls, then you do NOT want the stage=0 assignment statement to be executed on every call. Instead, do something like

   integer, save :: stage = 0

This tells the compiler to 1) save the value for subsequent calls, and 2) initialize (in the fortran sense) the value to 0. That value of zero will be available to the subroutine in the first call, and it can then be reset as appropriate for subsequent calls.

As the code is written, the save attribute doesn’t really accomplish anything because stage has only local scope (no other routine can see it) and because it is reset anew every time the subroutine is called.

I should perhaps mention that the save attribute is redundant in the above code, but most programmers (including me) prefer to include it explicitly nonetheless just to avoid any confusion about the programmer’s intent.

ilayn · August 22, 2024, 10:42pm

Yes quite possible. I just made it up on the fly without paying too much attention. Thank you both.

ivanpribec · August 23, 2024, 12:56am

To me at least, static variables made much more sense after I wrote my little CHIP-8 emulator.

Like everything in a Von Neumann architecture, a static variable is just a chunk of memory. Nowadays it is more complicated, because of the mapping into the address space, loading at run-time, etc.

Counter with static variable in CHIP-8 assembly

; CHIP-8 uses 16 registers, named V0 - VF (hex)
; The value I is a 16-bit address pointer

main:
    LD VA, 0        ; VA = 0
    CALL count
    CALL count
    CALL count
    LD VA, 1        ; VA = 1, retrieve count,
    CALL count      ; result is found in VB 
    LD V3, 2
    LD V4, 2
    LD V5, VB       
    call write      ; Write count to screen
.fin:
    JP .fin

; Count number of function calls using a 
; static variable
;
; Arguments:
;   VA ... if 1, write output value
;   VB ... output value
;
count:
    JP .body        ; Jump to executable part 
.counter:           ; Counter variable, reserve
    #d8 0x0         ;    1 byte of memory              <-- static variable
.body:
    LD I, .counter  ; Load address of counter
    LD V0, [I]      ; Load counter into register
    ADD V0, 1       ; Increment
    LD I, .counter  ; Re-load address
    LD [I], V0      ; Store counter
    SE VA, 1        ; Skip if VA == 1
    JP .end
    LD VB, V0       ; VB = V0
.end:
    RET

write:
   ; ... omitted ...

It’s kind of wonky because the instruction set is very primitive, I’m using a ad-hoc calling convention, and instructions and data share the same address space without any form of protection.

After compilation it becomes a binary blob (shown here as a hex dump); the static (or saved in Fortran parlance) variable is the one in column 9, row 2, that is 00, just below 6a.

                                vv
 00 | 6a 00 22 16  22 16 22 16  6a 01 22 16  63 02 64 02 | j.".".".j.".c.d. |
 10 | 85 b0 22 2b  12 14 12 19  00 a2 18 f0  65 70 01 a2 | .."+........ep.. | <--
 20 | 18 f0 55 3a  01 12 29 8b  00 00 ee a2  6a d3 45 73 | ..U:..).....j.Es |
 30 | 06 a2 6f d3  45 73 06 a2  74 d3 45 73  06 a2 79 d3 | ..o.Es..t.Es..y. |
 40 | 45 73 06 a2  7e d3 45 73  06 a2 83 d3  45 73 06 a2 | Es..~.Es....Es.. |
 50 | 67 f5 33 f2  65 f0 29 d3  45 73 05 f1  29 d3 45 73 | g.3.e.).Es..).Es |
 60 | 05 f2 29 d3  45 00 ee 00  00 00 f8 80  80 80 f8 f8 | ..).E........... |
 70 | 88 88 88 f8  88 88 88 88  f8 88 c8 a8  98 88 f8 20 | ...............  |
 80 | 20 20 20 00  f8 00 f8 00  .. .. .. ..  .. .. .. .. |    ............. |

After running in the emulator, it works like intended:

The blog posts from Raymond Chen from Microsoft are also a gold mine of knowledge on memory organization and argument passing conventions:

As Raymond nicely illustrates in the second link, in the past everything including local variables would have static storage duration. There are still ways to force this behavior, e.g. using -fno-automatic or certain DEC extensions.

Edit: after sleeping over this, I began to ponder if any compilers store static variables within the instructions themselves making use of self-modifying code:

; Increment counter (uses self-modifying code) 
; Set VA = 1 beforehand to retrieve it. Output in VB.
; The counter is stored as an immediate value in the 
; instructions that loads it
count:
    LD I, .load ; Load address
    LD V1, [I]  ; Load V0 and V1 with contents at I and I+1
    ADD V1, 1   ; Add 1 to counter
    LD I, .load ; Re-load address
    LD [I], V1  ; Store V0 and V1 at I and I+1
    SNE VA, 1   ; Skip if VA != 1
.load:
    LD VB, 0x0  ; Retrieve counter (stored in this immediate value)
    RET

certik · August 23, 2024, 2:47am

@ilayn judging from your frequent posts here, by the time you are done succeeding at your goal of removing all of Fortran from SciPy, you will end up quite a Fortran expert. If you end up being sad and want to contribute to some Fortran projects (no longer in SciPy), you will always be welcome to send us PRs to the many Fortran projects here.

ilayn · August 23, 2024, 6:52am

Ha! I already started

github.com/fortran-lang/vscode-fortran-support

bug(syntax): F77 code highlighting cannot resolve a bare statement with line continuation

opened 08:22PM - 10 Aug 24 UTC

ilayn

syntax-highlight

### Is there an existing issue for this? - [X] I have searched the existing iss…ues ### Description If the file https://github.com/scipy/scipy/blob/maintenance/1.14.x/scipy/optimize/lbfgsb_src/lbfgsb.f is opened in VS Code, the syntax highlighting comes quite nice all the way down to the line 967 and then gets confused on a naked `format` directive and cannot recover until the end of the file. On my system, it looks like the screenshot pasted below. However, if I transfer a single character from 968 to 967 so format has at least something in the same line, it manages to recover. This leads me to believe that there is a greedy search for an argument after format line but cannot resolve the line continuation context. ### Screenshots ![image](https://github.com/user-attachments/assets/0344182b-bec9-427f-bfe3-6efa7e1a8a6f) ![image](https://github.com/user-attachments/assets/7b81939a-5440-4b33-89c6-e0081f391c8a) ### Code Example ```Fortran ! Code above snippped dsave(14) = stp dsave(15) = gdold dsave(16) = dtd 1001 format (//,'ITERATION ',i5) 1002 format + (/,'At iterate',i5,4x,'f= ',1p,d12.5,4x,'|proj g|= ',1p,d12.5) 1003 format (2(1x,i4),5x,'-',5x,'-',3x,'-',5x,'-',5x,'-',8x,'-',3x, + 1p,2(1x,d10.3)) 1004 format (' ys=',1p,e10.3,' -gs=',1p,e10.3,' BFGS update SKIPPED') 1005 format (/, +' Singular triangular system detected;',/, +' refresh the lbfgs memory and restart the iteration.') 1006 format (/, +' Nonpositive definiteness in Cholesky factorization in formk;',/, +' refresh the lbfgs memory and restart the iteration.') 1007 format (/, +' Nonpositive definiteness in Cholesky factorization in formt;',/, +' refresh the lbfgs memory and restart the iteration.') 1008 format (/, +' Bad direction in the line search;',/, +' refresh the lbfgs memory and restart the iteration.') return ! Code below snippped ``` ### Fortran Version (Free/Fixed Form) Fixed Form e.g. F77 ### Version of Modern Fortran v3.4.2024061901 (pre-release) ### Version of Visual Studio Code v1.92.0 ### Platform and Architecture Windows 10 ### Additional Information _No response_

ilayn · August 23, 2024, 7:36am

In this translation, we cannot use static just as we cannot use SAVE in Fortran because there is no single user call guarantee. Users (already) want to be able to call this code from many threads. Hence static vars will have races. So everything is going to be stored in a struct that I am passing around by reference to keep the state. There are about 50ish variables in total. Still very very ugly but necessary evil unless we want to rewrite the whole thing from scratch. Unfortunately I don’t have such capacity currently. But maybe in the future.

Arjen · August 23, 2024, 9:15am

One thing about SAVEd variables I did not see in this conversation is this:

integer :: counter = 0

versus the C fragment:

int counter = 0;

either one appearing in a function/routine.

There is an essential difference here between Fortran and C:

In C the variable would be set to zero on every entry to the function/routine.
In Fortran the variable automatically (!) gets the SAVE attribute and has the value zero right from the start of the program/DLL/SO, without the containing function/routine having been called. The variable is NOT set to zero when you enter the function/routine. (At least that is my understanding )
The Fortran fragment is equivalent to:

static int counter = 0;

It is probably best to explicitly use the SAVE attribute on such variables, if only that the SAVE attribute then documents that fact.

(I hope this is not causing confusion.)

ivanpribec · August 23, 2024, 9:32am

That sounds like a good plan. Looking forward to the result.

In case of multi-threading, my understanding is that often you’d like the matvec operation to be done in parallel. SciPy has lacked this so far, but I see there is work on it now: ENH: Parallelize CSR (sparse matrix) methods by alugowski · Pull Request #19717 · scipy/scipy · GitHub.

The second-case is when you have multiple eigenvalue problems (that is matrices), and you’d like find the eigenvalues of each of them in parallel. To make this possible, thread-safe ARPACK procedures are needed, which is currently not the case. I’m guessing one can still use multi-processing here.

I’m wondering how would the two different modes of parallelism interact?

Btw, once you have the C library in-place, you can try refactor it using semantic patches: Coccinelle: A Program Matching and Transformation Tool for Systems Code. This is sort of like a “search and replace” on steroids. In the reverse directions, Fortranners can use it to generate interfaces from the C header.

ilayn · August 23, 2024, 9:33am

Ah you are too kind but two months late, since it caused the confusion and costed me a week. Here is my confusion in text format

ilayn · August 23, 2024, 9:37am

My guess is that it will be due to mostly mutual exclusivity. Because after reverse communication the control is completely passed to Python side and hence matvec. I don’t know what will happen if matvec code also calls ARPACK code but at some point my nano brain cannot handle that much complication.

The random C-API of NumPy has built in properties for parallelism so if I invoke correctly it should go through. The rest of the code is now stateless so on paper it should be safe.

jacobwilliams · August 25, 2024, 3:30pm

It does fill me with sadness to see these old Fortran libraries being converted to C. C is just not the numerical programming language of the future. We have shown how to modernize the old Fortran 77 codes to bring them up to date with modern concepts (object oriented, thread safe, etc.) For SciPy, all that was needed was someone over the past 20 years to realize that this could be done, but I guess it never was. A huge missed opportunity for the Fortran community. Fortran usage in SciPy could have been a great driver for improving modern Fortran tooling (beyond the f2py hack). I guess it is too late now? I feel like we are improving these things slowing but surely, but without SciPy we lose a huge potential pool of users and developers.

ilayn · August 25, 2024, 4:38pm

100% agree. But if you want to wait for that silver bullet to arrive, C is the seemingly only candidate to archive things. There are at least 4 open-source competitive languages that can consume native C code. In fact I am using Zig to compile my C code personally. Bear with me to make a point below.

Again fully agree. I think your codebase should replace the old versions in all tools including R and Octave and any other libraries wrapping these old things.

True but this is a very important sentence to ponder because that’s exactly what we did for a few years, at least I did, before I decided to kill my evenings and weekends with this silly crusade of mine about the code within SciPy. Why did not folks rise up to the task of touching these codebase while every corner of the world is booming with open-source contributors, from web frameworks to ML. This is a fundamental point to think about by the powers of Fortran that be and maybe the LFortran team, obviously not in this issue but also maybe an internal discussion. Note that I am not using any features or shortcomings of Fortran old or new. Just the concept itself; attracting changes or modernization.

I don’t consider languages as separate islands. That’s somehow a fortranism, that is users of fortran become somehow very emotional about this tool. I will never get it but fine it is what it is.

It is not late at all, because the final level boss is just sitting there. And I do believe that the scientific community should do something about it as soon as possible; that is LAPACK.

I have been talking to many folks, compiler designers and others, to find something to convert LAPACK into some array language that is not F77. You folks and I clearly have different ideas about modern fortran. That’s perfectly fine. This should not hinder any individual efforts, you do the new fortran port I’ll do something else. Every bit of that effort is valuable.

I am planning to give a talk sometime somewhere about this and create a very mild call for action. You can see my unpolished quickfire summary in Rewrite LAPACK so it's not in F77 · Issue #7 · scientific-python/faster-scientific-python-ideas · GitHub I hope folks will give me the benefit of doubt before start defending Fortran. This is independent from the language but to save all that nuggets of gold and diamond to the next level with LAPACK which is our collective linear algebra knowledge.

A year ago, I have been receiving, “that’s impossible, one person can’t read all that code” comments left and right almost every week, now we are talking about “maybe we should do something about LAPACK” and I think that is progress. Lfortran is clearly another wonderful thing. These should not be mutually exclusive. The ultimate goal is to remove the fixed format spaghetti code from scientific domain.

The more the merrier whether through modernized fortran or some other; ideally both simultaneously.

But please do something about this multiline comment topic. I’m tired of regexing 150000 lines of code

Topic		Replies	Views
Using reserved words as variables Help	61	2132	September 4, 2023
The difficulties of using Fortran in SciPy	98	5173	April 13, 2024
LAPACK interfaces	45	3229	August 21, 2023
"A Perspective on Sustainable Computational Chemistry Software Development and Integration": doutbtful comments about Fortran Advocacy	15	1005	March 7, 2024
Packs -- the classic Fortran libraries	32	3684	February 14, 2022

SAVE attribute scope

Related topics