New SPAG restructuring options

I’d welcome feedback on some new stuff in a development version of SPAG which is currently on the stocks. There are 2 new restructuring mechanisms which, in conjunction with the existing facilities, remove all(*) GOTO constructs in even the most convoluted legacy code.

In the first construct, the source code is divided into separately addressable blocks, which are executed in the correct sequence under control of a dispatch loop.

DispatchLoop : DO
   SELECT CASE (DispatchVariable)
   CASE (1)
   	block 1
   CASE (2)
      block 2
   CASE (3)
      block 3
      ...
   END SELECT
ENDDO DispatchLoop

GOTOs are replaced by an appropriate assignment to DispatchVariable followed by CYCLE DispatchLoop. DO loops are treated as unbreakable units, but each DO may contain its own dispatch loop, which occasionally leads to nested dispatch loops. I think that overall, it works well, and the imposed structure often helps to clarify what the code is doing.

The second construct is similar, but in this case the code blocks are placed in recursive internal subroutines, and GOTOs are replaced by calls to those routines, followed by a RETURN to ensure the recursion unwinds correctly.

This second approach can lead to very deep recursion, and that worries me. However, no data is passed with each new call (the internal subroutines have no arguments or local variables). To date, we have seen no noticeable degradation in performance or memory use using this approach, but in one case we had to make a modest increase to the stack size. In simple cases, for example when different strands use the same tidying up code, this approach seems ideal, but for more complex cases, I think I find it less clear than using dispatch loops. I also worry that compilers may baulk at it, or produce code that is likely to run out of resources at runtime. Many more tests, including a huge F77 engineering code, are planned.

What do you guys think - which approach do you prefer (if either)? I’ve attached an example of a very convoluted program for solving chess problems, and the output produced using the the current release (v0) and with the above new features (v1 and v2).

(*) Either approach works well for the most common spaghetti Fortran constructs (GOTO, arithmetic IF, computed GOTO), but not so easily with ASSIGNed GOTO, alternate returns, and END= etc in I/O statements.

I’m surprised no one has replied yet, but my preference would be for v2, with one change: that the contents of block_1 are in the host. For example:

!*==IUNITS.f90  processed by SPAG 8.00DC at 16:15 on  7 Aug 2023
      FUNCTION iunits(Jdif,Kdif,Ju,Ku,Ip)
        IMPLICIT NONE
        INTEGER :: iunits
!
! Dummy argument declarations rewritten by SPAG
!
        INTEGER , INTENT(IN) :: Jdif
        INTEGER , INTENT(IN) :: Kdif
        INTEGER , INTENT(INOUT) :: Ju
        INTEGER , INTENT(OUT) :: Ku
        INTEGER , INTENT(OUT) :: Ip
!
! End of declarations rewritten by SPAG
!
        iunits = 1
        IF ( Jdif<0 ) THEN
            Ju = -1
            IF ( Kdif<0 ) THEN
            ELSEIF ( Kdif==0 ) THEN
                CALL block_2
                RETURN
            ELSE
                CALL block_3
                RETURN
            ENDIF
        ELSEIF ( Jdif==0 ) THEN
            Ju = 0
            Ku = Kdif/iabs(Kdif)
            Ip = 4
            RETURN
        ELSE
            Ju = 1
            IF ( Kdif<0 ) THEN
                CALL block_3
                RETURN
            ENDIF
            IF ( Kdif==0 ) THEN
                CALL block_2
                RETURN
            ENDIF
        ENDIF
        IF ( Jdif/=Kdif ) THEN
            iunits = 0
            RETURN
        ELSE
            Ku = Ju
            Ip = 3
            RETURN
        ENDIF
     CONTAINS
        RECURSIVE SUBROUTINE block_2
           Ku = 0
           Ip = 4
           RETURN
        END SUBROUTINE block_2
        RECURSIVE SUBROUTINE block_3
           IF ( Jdif+Kdif/=0 ) THEN
              iunits = 0
           ELSE
              Ku = -Ju
              Ip = 3
              RETURN
           ENDIF
        END SUBROUTINE block_3
     END FUNCTION iunits

This code reminds me of programming with Alliant Fortran in the 1980s. They were one of the early parallel computers, and their specialty was compiler-driven, loop-based, shared-memory parallelism, more or less like OpenMP is today. Some parallelism was automatically recognized by the compiler, and some required compiler directives. In one of my codes, I had a bunch of scalar code, and I knew from data independence that it could be computed in parallel, but there was no way for me to tell the compiler, so I asked one of the technical support people how it could be done. He thought about it for a couple of days, and then he came back with that kind of solution. I broke up the code into separate if-then-else blocks (this was before select case), put them in a do loop, and added the compiler directive to force the loop to operate in parallel. That effectively gave me parallel execution of the separate scalar blocks.

Yes - it works well in this case, and I think it reflects the intention of the original programmer. You don’t even need RECURSIVE.

I’ve been experimenting with a test case where, depending on the value of dtarg, the recursive subroutine solution leads to arbitrarily deep recursion. I have to increase the stack to the limit to get far, and eventually (at a depth of around 50-100 million, the program melts down. I suppose that by then, the stack is full of return instructions. I don’t think SPAG is clever enough to know whether that will happen, so we’d need some careful caveats.

This feels like one of those places where the halting problem unexpectedly shows up, if you really want to handle the completely general case. As usual, the solution is to tell the user (via the docs) not to try solving the halting problem!

I have tried out the new version of SPAG, V. 8.01DH on a few (about a dozen) of the ACM TOMS packages from the 1970s and 1980s. In particular, I requested the option 4=76, which causes SPAG to replace nests of GOTO statements by “dispatch loops”. In most cases, the conversion succeeded, and the process is invigorating.

In a couple of instances, the original code assumed implicit SAVE and initialize-to-zero for all variables (as was common on old mainframe Fortran systems), and it took a little bit of work to find which local variables had to be given the SAVE attribute explicitly to allow the program to run on current PCs.

I wish to report here a case where SPAG produced converted source code that, when compiled and run, seemed to terminate in an endless loop. The case is ACM TOMS 534, which is for integrating stiff ODEs, subroutine STINT . The code (driver+solver) is about 1100 lines of Fortran, with about 70 GOTO statements and about 120 labelled statements.

The attached zip file (please remove the “.txt” suffix before extracting contents) contains

  • The TOMS source code fixed up to enable compiling and running with gFortran

  • A short text file describing the changes made, compiler flags used, and flags given to SPAG, and instructions to reproduce the endless loop

  • The F90 files generated by SPAG on my PC

You may notice that the test code has a high degree of “knottiness”, and I am not asking @apple3feet to do anything about this issue unless it catches his interest.
t534.zip.txt (31.6 KB)

Thanks for reporting this. I appreciate the efforts of users reporting bugs (especially when there is sufficient information to reproduce them!) I have fixed this one in our development code, and I expect to release a point update which includes the fix in the new year.

2 Likes

Version 8.02 of plusFORT is now available at Download Current Version - Fortran UK. This fixes your bug and some others


Version 8.02A January 2024

  • Declaration rewriting is no longer disabled if there are ENTRY
    statements with no arguments.
  • Conversion of bare COMMON statements to MODULES is no longer
    enabled by adding 40 to item 4 of the SPAG configuration, unless
    item 13 is also set greater than 0. Conversion of INCLUDE files
    is not affected.
  • SPAG has new check for EQUIVALENCE involving variables inside and
    outside INCLUDE file.
  • Fixed bug which could cause looping in complex nested dispatch
    loops.
  • Fixed bug when item 238 - prefix for names invented by SPAG - was
    more than 7 characters long.
  • Fixed bug with fixed source form input files when input file
    reordering is switched off.
  • Fixed bug in “Module Maker” mode with multiple inputs.
  • SPAG closing summary gives a clearer indication of errors.
  • The HyperKWIC manual has been integrated into the plusFORT
    manual.
  • Manual updated to revision P2.
2 Likes

This item 13 is present in spag.fig, but is not described in the manual file pfmanual.chm.

Thanks for the fixes. The problems that I reported regarding TOMS 534 are no longer present, and I can try SPAG on more old codes.

Hi
Please check that you are using Rev P2 of the manual. The latest releases do not include a .CHM file - just pure HTML. See online at plusFORT Manual or, in Windows, at file:///C:/plusFORT/802/manual/index.html.

Thanks
JA

1 Like

Sorry, my mistake. I let PF 8.02 install itself in the same directory where I had 8.01 previously installed, and failed to check the date on pfmanual.chm .

I just downloaded this new version and as before, I get the message from “MIcroSoft Defender SmartScreen” (what’s in a name, eh?) that it prevented the installation program from running, as the publisher is unknown. I guess you are aware of this, but it may be a problem for some people.

You’re right. I’m in the process of updating the installation instructions, and I’ll put something in about it. Similarly for Apple.

I have applied it to a few source files in one of the programs I maintain and saw a few things that would need improvement. I can send you the complete source files, but here are a few fragments:

 2210 format (    ' Note: one or more special waste loads - parameter/se
     &gment function SURF required' )

gets translated to:

99023             FORMAT (                                                                                                         &
     &' Note: one or more special waste loads - parameter/se                                                                       &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                                             &
     &                                                                                                      gment function SURF req&
     &uired')

This series of IF statements with GOTOs leads to deep nesting, which does not make the source code clearer I must say:

      if ( gettoken ( modid1, ierr2 ) .gt. 0 ) goto 100
      if ( gettoken ( modid2, ierr2 ) .gt. 0 ) goto 100
      write ( lun( 2)  ) modid1, modid2
      write ( lunut  , 2010 ) modid1, modid2
      if ( gettoken ( runid1, ierr2 ) .gt. 0 ) goto 100
      if ( gettoken ( runid2, ierr2 ) .gt. 0 ) goto 100
      write ( lun( 2)  ) runid1, runid2
      write ( lunut  , 2020 ) runid1, runid2

That is:

            IF ( gettoken(modid1,ierr2)<=0 ) THEN
               IF ( gettoken(modid2,ierr2)<=0 ) THEN
                  WRITE (Lun(2)) modid1 , modid2
                  WRITE (lunut,99002) modid1 , modid2
99002             FORMAT (//' Model :            ',A40,/20X,A40)
                  IF ( gettoken(runid1,ierr2)<=0 ) THEN
                     IF ( gettoken(runid2,ierr2)<=0 ) THEN
                        WRITE (Lun(2)) runid1 , runid2
                        WRITE (lunut,99003) runid1 , runid2
99003                   FORMAT (//' Run   :            ',A40,/20X,A40)

Of course, it is a rather idiosyncratic piece of code and a tool like SPAG would have a hard time to do something nicer with it.

I noticed that no internal routines are used when the routine is already in a module.

Comments after a USE statements are displaced:

      use timers       !   performance timers
      ... 15 lines of comments

becomes:

      USE timers
      ... 15 lines of comments
                       !   performance timers

A lengthy set of string comparisons:

      IF ( IABS(ITYPE) .EQ. 1 .AND.
     *         (  CHULP(1: 5) .EQ. 'BLOCK'        .OR.
       .... quite a few more

becomes:

                  IF ( iabs(Itype)==1 .AND. (Chulp(1:5)=='BLOCK' .OR. Chulp(1:6)=='LINEAR' .OR. Chulp(1:4)=='ITEM' .OR. Chulp(1:13)&
                     & =='IDENTICALITEM' .OR. Chulp(1:12)=='USEDATA_ITEM' .OR. Chulp(1:7)=='FORITEM' .OR. Chulp(1:9)               &
                  ...

Again, rather idiosyncratic, but SPAG seems to be very eager to fill up the space that a continued statement takes. This also happens with long argument lists (not uncommon in old code).

Is there an option to turn this off?

1 Like

@Arjen, see section 2.7.10 of the plusFORT manual for the usage of the ASIS directive. This method, however, does require the user to sprinkle the input source with directives before processing with SPAG.

In similar circumstances, especially when the input source has Hollerith constants for literal strings in Format statements, I have found it more convenient to use my own utility to preprocess the source file before running SPAG on it.

Ah, thanks! That might be useful indeed.

You may find it worthwhile to edit the configuration files spag.fig, etc., and adjust the items that control the format of the output file (e.g., Item 121). I ran SPAG 121-72 on the following input

      program pmsg
      implicit none
      print 2210
 2210 format (    ' Note: one or more special waste loads - parameter/se
     &gment function SURF required' )
      end

and the output file contained

!*==PMSG.f90 processed by SPAG 8.02DA 08:36  6 Jan 2024
!!SPAG Open source Personal, Educational or Academic User  NON-COMMERCIAL USE - Not for use on proprietary or closed source code
PROGRAM pmsg
   IMPLICIT NONE
   PRINT 99001
99001 FORMAT (                                                          &
     &' Note: one or more special waste loads - parameter/segment functi&
     &on SURF required')
END PROGRAM pmsg

It would be nice to have an option to ask SPAG to respect word boundaries when it reformats text strings.

And with no switches at all, I get:

!*==PMSG.f90 processed by SPAG 8.02DA 12:09  7 Jan 2024
PROGRAM pmsg
   IMPLICIT NONE
   PRINT 99001
99001 FORMAT (' Note: one or more special waste loads - parameter/segment function SURF required')
END PROGRAM pmsg

The relevant setting is #51 - the input file record length. The default for F90 source form is 5120, but SPAG normally resets it to 72 for fixed source form.

You could change it yourself if needed, but I was initially at a loss to understand why it had not been reset automatically. I think the reason may be that there are TAB characters in your input source code. TABs do not normally play well with the strict column limits of fixed source form, as they mean different things at different times. When SPAG sees the TABs, it assumes the source is using the VAX style source form which embraces TABs and has no 72 character limit.

I will change this so that SPAG never pads continued character constants beyond column 72.

I checked the original source file: there are no tabs in there. I used SPAG 8.02DA. I can send you the original file and the result. I did not use any specific flags (I ran SPAG via the GUI).

BTW: My earlier conclusion about SPAG not using internal routines when the code is contained in a module, is not correct. When I removed the module, a very similar transformed file resulted. But even forcing the use of internal routines in that case did not result in them. Maybe I should have a closer look at the documentation :slight_smile:

Yes I think it would be helpful to see the original code.

On the internal subroutine question - it appears that the “force use of internal routines” option requires setting item 31 to 4 rather than 3. That option is available via the command line but not through PFFE. I’ll fix it.

Thanks

JA