Has anyone played with Sale's Fortran classification algorithm lately?

A 1971 article by Arthur Sale showed an algorithm for classifying Fortran 66 statements:

(PDF) The classification of FORTRAN statements | Arthur Sale - Academia.edu

I’ve been messing around with some ancient code lately and thought I’d try it out as well. Unfortunately the above pdf isn’t quite clear on a couple of the lookup table values. Has anyone here tried implementing it, and can confirm that I have the table values right? The lines with the ‘???’ comments are the ones in question:

c…Character array for tree scan
data kalp( 1),kalp( 2),kalp( 3),kalp( 4) /1hi,1hf,1hn,1hg/
data kalp( 5),kalp( 6),kalp( 7),kalp( 8) /1ho,1ht,1ho,1hi/ ! ???
data kalp( 9),kalp(10),kalp(11),kalp(12) /1hc,1ha,1ho,1hn/
data kalp(13),kalp(14),kalp(15),kalp(16) /1hm,1hm,1hp,1hr/
data kalp(17),kalp(18),kalp(19),kalp(20) /1he,1ha,1hd,1hl/
data kalp(21),kalp(22),kalp(23),kalp(24) /1ht,1hw,1hf,1ho/
data kalp(25),kalp(26),kalp(27),kalp(28) /1hu,1hd,1hi,1ha/ ! ???
data kalp(29),kalp(30),kalp(31),kalp(32) /1ho,1hu,1hw,1hs/
data kalp(33),kalp(34),kalp(35),kalp(36) /1ht,1hu,1he,1hn/
data kalp(37),kalp(38),kalp(39),kalp(40) /1hd,1hf,1hx,1hq/
data kalp(41),kalp(42),kalp(43),kalp(44) /1hb,1ha,1hl,1ha/
data kalp(45),kalp(46) /1hl,1hp /

A reasonably clear copy of the paper is available.
The characters used to initialize the array KALP are all capital letters; you have used lower case. The value for KALP(8) should be ‘(’, not ‘i’. The second line with ‘???’ looks OK, except for your use of lowercase letters.

2 Likes

Thanks so much! My comparison function KCOMP folds upper case to lower case. So having lower case letters in the DATA statement, and elsewhere, isn’t a problem.

Got it to work! It correctly classifies all 35 cases. The 36th case, for ‘rogue’ statements, isn’t always taken when it should due to limitations in the algorithm. For example it classifies a non-66-Standard PRINT statement as a PAUSE statement.

One thing that threw me for a bit was the assigned GO TO statement. In Fortran 66, the statement must look something like:

GO TO XYZZY, (21, 22, 23)

However in Fortran 77, one can elide the comma and parenthesized label list. If one just says “GO TO XYZZY”, as I initially tried to do, Sale will classify it as a regular GOTO since he looks for the comma to differentiate the two.

Fun stuff.

You have probably noted and corrected the following bug in the program as listed in the paper:

The array KDEC is declared and initialized with 10 elements, but statement 7 has DO 8 = 1, 12.

Interestingly, the article I copied the code from correctly loops from 1 to 10. I’ve tested the code with -fbounds-check and all is well.

It is kinda entertaining playing with old code that uses arithmetic IFs and the like so heavily. I’ve always used logical IFs. Perhaps because back in ~1970, I started out using PDP-8 BASIC - which had logical IFs. (In retrospect, and looking through docs at bitsavers, PDP-8 Fortran only had arithmetic IFs and computed GOTOs. But I really started using Fortran on DECsystem-10s - which had a full implementation of the language.)

Arthur Sale taught me FORTRAN 66 in 1973 !
We got ONE batch run a day when doing Fortran assignments on the IBM 7040 in the Basser computer basement. It wasn’t easy to find all our coding bugs in those days.
(might have been IBM 7030 and 026 punch card machines !!)

3 Likes

Obituary of Prof. Sale: Vale Professor Arthur Sale.

1 Like

I am curious to know if the program (after your corrections, as noted earlier), when run on its own source file as the input data file, classifies any lines with code 36.

If you do not mind, please share the source file. The program could be used to test whether a test source file is in conformance with “Basic ANSI Fortran”. This program has a special property: if the source code is “polished” or “refactored”, the polished version will label its own source mostly with code 36!

Here is the original code. The only change is that I used implicit none to help find typos and declare all variables, then commented it out. (If lower case bothers you, run it through ‘tr’.)

C.....To classify USAS Fortran records into 36 classes
      subroutine class (k, ityp)
C.....           *****
C.....Specification
c     implicit none
      integer k(72)
      integer ityp
C***********************************************************************
C*    USAS Fortran record classifier  -CLASS-                          *
C*                                                                     *
C*    Language... .  .                                                 *
C*        USAS Fortran (but nearly all USAS basic Fortran              *
C*    Input... .  .                                                    *
C*        K - An integer array containing 72A1 characters (not altered)*
C*    Output... .  .
C*        ITYP - An integer type code from 1 to 36                     *
C*    Error exits... .  .
C*        None
C*    Subroutines required... .  .
C*        KCOMP - A machine dependent integer function that accepts    *
C*        as arguments an A1 character and a Hollerith constant, and   *
C*        returns 0 if they represent the same character, otherwise 1  *
C*    Type codes... .  .
C*         1 comment      2 continuation 3 assignment   4 assign       *
C*         5 go to        6 assd. go to  7 comp. go to  8 arith. if    *
C*         9 logical if  10 do          11 continue    12 call
c*        13 return      14 stop        15 pause       16 read         *
C*        17 write       18 rewind      19 backspace   20 endfile      *
c*        21 format      22 integer     23 real        24 doub. prec.  *
c*        25 complex     16 logical     27 external    28 dimension    *
c*        29 common      30 equivalence 31 data        32 block data   *
c*        33 subroutine  34 function    35 end         36 *rogue*      *
c***********************************************************************
c.....Declarations
      integer kalp(46), ksuc(46), kfal(46), kdec(10), kf(8)
c.....Length of array k
      integer leng
      data leng /72/
c.....Alphameric constants and match tables
c.....The decimal digits for integer recognition
      data kdec(1), kdec(2), kdec(3), kdec(4), kdec( 5),
     1     kdec(6), kdec(7), kdec(8), kdec(9), kdec(10)
     2     /1h0, 1h1, 1h2, 1h3, 1h4, 1h5, 1h6, 1h7, 1h8, 1h9/
c.....The word -FUNCTION- for matching
      data kf(1), kf(2), kf(3), kf(4), kf(5),kf(6), kf(7), kf(8)
     1     /1hf, 1hu, 1hn, 1hc, 1ht, 1hi, 1ho, 1hn/
c.....Characters for recognition tests
c.....     which are..   c blank ( ) = h / * 0 ,
      integer kc,kblnk,klpar,krpar,keq
      data kc,kblnk,klpar,krpar,keq  /1hc,1h ,1h(,1h),1h=/
      integer kh,kslsh,kastk,kzero,kcma
      data kh,kslsh,kastk,kzero,kcma /1hh,1h/,1h*,1h0,1h,/
c.....Character array for tree scan
      data kalp( 1),kalp( 2),kalp( 3),kalp( 4) /1hi,1hf,1hn,1hg/
      data kalp( 5),kalp( 6),kalp( 7),kalp( 8) /1ho,1ht,1ho,1h(/
      data kalp( 9),kalp(10),kalp(11),kalp(12) /1hc,1ha,1ho,1hn/
      data kalp(13),kalp(14),kalp(15),kalp(16) /1hm,1hm,1hp,1hr/
      data kalp(17),kalp(18),kalp(19),kalp(20) /1he,1ha,1hd,1hl/
      data kalp(21),kalp(22),kalp(23),kalp(24) /1ht,1hw,1hf,1ho/
      data kalp(25),kalp(26),kalp(27),kalp(28) /1hu,1hd,1hi,1ha/
      data kalp(29),kalp(30),kalp(31),kalp(32) /1ho,1hu,1hw,1hs/
      data kalp(33),kalp(34),kalp(35),kalp(36) /1ht,1hu,1he,1hn/
      data kalp(37),kalp(38),kalp(39),kalp(40) /1hd,1hf,1hx,1hq/
      data kalp(41),kalp(42),kalp(43),kalp(44) /1hb,1ha,1hl,1ha/
      data kalp(45),kalp(46)                   /1hl,1hp        /
c.....Succeed link for tree scan
      data ksuc( 1),ksuc( 2),ksuc( 3),ksuc( 4) /  2, -8,-22,  5/
      data ksuc( 5),ksuc( 6),ksuc( 7),ksuc( 8) /  6,  7,  8, -7/
      data ksuc( 9),ksuc(10),ksuc(11),ksuc(12) / 10,-12, 12,-11/
      data ksuc(13),ksuc(14),ksuc(15),ksuc(16) / 14,-29,-25, 17/
      data ksuc(17),ksuc(18),ksuc(19),ksuc(20) / 18, 19,-16,-23/
      data ksuc(21),ksuc(22),ksuc(23),ksuc(24) /-13,-18, 24,-21/
      data ksuc(25),ksuc(26),ksuc(27),ksuc(28) /-34, 27,-28,-31/
      data ksuc(29),ksuc(30),ksuc(31),ksuc(32) / 30,-24,-17, 33/
      data ksuc(33),ksuc(34),ksuc(35),ksuc(36) /-14,-33, 36, 37/
      data ksuc(37),ksuc(38),ksuc(39),ksuc(40) / 38,-20,-27,-30/
      data ksuc(41),ksuc(42),ksuc(43),ksuc(44) / 42,-19,-32, -4/
      data ksuc(45),ksuc(46)                   /-26,-15        /
c.....Fail link for tree scan
      data kfal( 1),kfal( 2),kfal( 3),kfal( 4) /  4,  3,-36,  9/
      data kfal( 5),kfal( 6),kfal( 7),kfal( 8) /-36,-36,-36, -5/
      data kfal( 9),kfal(10),kfal(11),kfal(12) / 16, 11,-36, 13/
      data kfal(13),kfal(14),kfal(15),kfal(16) /-36, 15,-36, 23/
      data kfal(17),kfal(18),kfal(19),kfal(20) /-36, 21, 20,-36/
      data kfal(21),kfal(22),kfal(23),kfal(24) / 22,-36, 26, 25/
      data kfal(25),kfal(26),kfal(27),kfal(28) /-36, 31, 28, 29/
      data kfal(29),kfal(30),kfal(31),kfal(32) /-36,-36, 32, 35/
      data kfal(33),kfal(34),kfal(35),kfal(36) / 34,-36, 41, 39/
      data kfal(37),kfal(38),kfal(39),kfal(40) /-36,-35, 40,-36/
      data kfal(41),kfal(42),kfal(43),kfal(44) / 44, 43,-36, 45/
      data kfal(45),kfal(46)                   / 46,-36        /
c..... Misc declarations
      integer isw
      integer jch, jcma, jeq, jholl, jsave, jsw, jtyp
      integer j, l
      integer kcomp
      external kcomp
c.......................................................................
c.....Program entry point
c.....First check for comment
      if (kcomp (k(1),kc)) 2,1,2
    1 jtyp=1
      go to 55
c.....Then check for a continuation record
    2 if (kcomp (k(6),kblnk)) 3,5,3
    3 if (kcomp (k(6),kzero)) 4,5,4
    4 jtyp=2
      go to 55
c.......................................................................
c.....Initialize the loop
    5 jsw=0
      isw=0
      jeq=0
      jcma=0
      jholl=0
      jsave=kblnk
c.....Assignment scan loop
      do 26 j=7,leng
        jch=k(j)
        if (kcomp (jch,kblnk)) 6,26,6
c.....Its not blank, is Hollerith switch on ... .  .
    6   if (jholl) 12,12,7
    7   do 8 l=1,10
          if (kcomp (jch,kdec(l))) 8,10,8
    8   continue
c.....First time, no integer means not Hollerith
        if (jholl-1) 11,11,9
c.....  Otherwise look for the H
    9   if (kcomp (jch,kh)) 11,32,11
c.....Still fits Hollerith constant syntax
   10   jholl=jholl+1
        go to 25
c.....Not a Hollerith constant, set switch off
   11   jholl=0
c.....Test other characters (),=/*
   12   if (kcomp (jch,klpar)) 13,20,13
   13   if (kcomp (jch,krpar)) 14,18,14
   14   if (kcomp (jch,kcma )) 15,22,15
   15   if (kcomp (jch,keq  )) 16,23,16
   16   if (kcomp (jch,kslsh)) 17,21,17
   17   if (kcomp (jch,kastk)) 25,21,25
c.....Right parenthesis found
   18   jsw=jsw-1
        if (jsw) 19,19,25
c.....Set switch to allow only one more non-blank character
   19   isw=1
        go to 26
c.....Left parenthesis found
   20   jsw=jsw+1
c.....Set Hollerith switch for ( , / *
   21   jholl=1
        go to 25
c.....Comma found, check level
   22   if (jsw) 30,30,21
c.....Equals sign found, check level
   23   if (jsw) 24,24,32
   24   jeq=1
c.....Test if terminated by switch set
   25   if (isw) 26,26,27
c.....End of assignment scan loop
   26 continue
      go to 28
c.....Save last character if terminated early
   27 jsave=jch
c.......................................................................
c.....Leave scan and come here if ... .  .
C.....    no more characters
c.....    one non-blank character after a right parenthesis
c.....not a do, might be assignment
   28 if (jeq) 32, 32, 29
   29 jtyp=3
      go to 55
c.......................................................................
c.....Leave scan and come here if ... .  .
c.....    an upper level comma found
c.....might be a DO, not an assignment
   30 jcma = 1
      if (jeq) 32,32,31
   31 jtyp=10
      go to 55
c.......................................................................
c.....Leave scan and come here if ... . ..
c.....    a Hollerith constant found
c.....    an equals in parentheses
c.....    filure of DO and assignment tests
c.....neither a DO nor an assignment
c.....enter the keyword classificaation
   32 j=1
      isw=7
   33 jch=k(isw)
c.....If a blank, ignore, get the next
      if (kcomp (jch,kblnk)) 34, 37, 34
c.....Test against current tree character
   34 if (kcomp (jch,kalp(j))) 35, 36, 35
c.....Character does not match, try the next in tree
   35 j=kfal(j)
      if (j) 39, 39, 34
c.....Character matches, try next in tree and in record
   36 j=ksuc(j)
      if (j) 39, 39, 37
   37 isw=isw+1
      if (isw-leng) 33, 33, 38
c.....If run out of characters, force a rogue type
   38 jch=kblnk
      go to 35
c.......................................................................
c.....Classification completed, form type code
   39 jtyp=-j
c.....Check to seeif more treatment needed
      if (jtyp-5 ) 55,45,40
   40 if (jtyp-8 ) 55,43,41
   41 if (jtyp-22) 55,42,42
   42 if (jtyp-26) 47,47,55
c.......................................................................
c.....Logical IF separation test
   43 do 44 l=1,10
      if (kcomp (jsave,kdec(l))) 44,55,44
   44 continue
      jtyp=9
      go to 55
c.......................................................................
c.....Separate assigned and unconditional GOTOs
   45 if (jcma) 55,55,46
   46 jtyp=6
      go to 55
c.......................................................................
c.....Check whether this is a type statement or a typed function
   47 l=11
      go to 52
   48 l=l+1
      if (l-leng) 49,49,55
   49 if (kcomp (k(l),kblnk))   50,48,50
   50 if (kcomp (k(l),kf(isw))) 51,53,51
   51 if (isw-1) 52,48,52
   52 isw=1
      go to 50
   53 isw=isw+1
      if (isw-8) 48,48,54
   54 jtyp=34
c.......................................................................
c.....All results come here for return
   55 ityp=jtyp
      return
      end

A little module with a wrapper for the classifier, and my case-insensitive KCOMP function:

module sale_mod
  implicit none

  character(12) :: line_type(36) = [  &
      'comment     ', 'continuation', 'assignment  ', 'assign      ',  &  !  1 -  4
      'go to       ', 'assd. go to ', 'comp. go to ', 'arith. if   ',  &  !  5 -  8
      'logical if  ', 'do          ', 'continue    ', 'call        ',  &  !  9 - 12
      'return      ', 'stop        ', 'pause       ', 'read        ',  &  ! 13 - 16
      'write       ', 'rewind      ', 'backspace   ', 'endfile     ',  &  ! 17 - 20
      'format      ', 'integer     ', 'real        ', 'doub. prec. ',  &  ! 21 - 24
      'complex     ', 'logical     ', 'external    ', 'dimension   ',  &  ! 25 - 28
      'common      ', 'equivalence ', 'data        ', 'block data  ',  &  ! 29 - 32
      'subroutine  ', 'function    ', 'end         ', '*rogue*     ' ]    ! 33 - 36

contains

  integer function sale_classify (line) result (res)
    character(*), intent(in) :: line

    character(72) :: line72
    integer :: hline(72)
    integer :: i

    line72 = line
    read (line72,'(72a1)') hline
    res = -42
    call class (hline, res)
    if (res < 1 .or. res > 36) then
      print *, 'returned out of range: ', res, ' - FAILed'
    end if

  end function

! Fortran 66 style hollerith routines.  Assume 1H (left justified, blank filled) per
! integer word.

! Use C Interop so that they can be called outside the module environment.

  integer function kcomp (i, j) bind (c, name='kcomp_')
    implicit none
    integer :: i, j

    kcomp = 1
    if (tolowerh (i) .eq. tolowerh (j)) kcomp = 0

  end function

  integer function tolowerh (i)
    implicit none
    integer :: i

    integer :: ic

    ic = 0
    call mvbits (i, 24, 8, ic, 0)
    if (ic .ge. 65 .and. ic .le. 90) ic = ic + 32
    tolowerh = i
    call mvbits (ic, 0, 8, tolowerh, 24)

  end function

end module

And a little driver program to try all the cases:

program test_driver
  use sale_mod
  implicit none

  call classify ('c', 1)
  call classify ('     1 x, y, z', 2)
  call classify ('      x = 42.42', 3)
  call classify ('      do i=1.10', 3)
  call classify ('      assign 10 to x', 4)

  call classify ('      goto 99', 5)
  call classify ('      go to x,(10,20,30)', 6) ! Assigned GOTO
  call classify ('      go to (i), 10, 20, 30', 7) ! Computed GOTO
  call classify ('      if (mystery) 10, 30, 50', 8) ! Arith IF

  call classify ('      if (j .eq. k) k = 42', 9)
  call classify ('      do i=1, 42', 10)
  call classify (' 100  continue', 11)
  call classify ('      call xyzzy (42)', 12)

  call classify ('      return', 13)
  call classify ('      stop', 14)
  call classify ('      pause 12', 15)
  call classify ('      read (5,100) x, y', 16)

  call classify ('      write (6,100) x, y', 17)
  call classify ('      rewind (1)', 18)
  call classify ('      backspace 2', 19)
  call classify ('      endfile 4', 20)

  call classify (' 100  format (12h hello world)', 21)
  call classify ('      integer i, j, k', 22)
  call classify ('      real x, y, z, z2', 23)
  call classify ('      doubleprecision da, db, dc', 24)

  call classify ('      complex ca, cb, cc(12)', 25)
  call classify ('      logical la, lb(12)', 26)
  call classify ('      external kompar, kompay', 27)
  call classify ('      dimension a(42)', 28)

  call classify ('      common /mycom/ a, b, c', 29)
  call classify ('      equivalence (a, b)', 30)
  call classify ('      data x/42.42/', 31)
  call classify ('      blockdata bd1', 32)

  call classify ('      subroutine xyzzy (a, b, 3)', 33)
  call classify ('      integer function myfunc (12)', 34)
  call classify ('      end', 35)
  call classify ('      print *, 42', 36) ! Fortran 77

contains

  subroutine classify (l1, expected)
    character(*), intent(in) :: l1
    integer, intent(in) :: expected

    integer :: line_class

    line_class = sale_classify (l1)
    print *, 'classified as: ', line_type(line_class),  &
        ' - ', merge ('passed', 'FAILed', line_class == expected)

  end subroutine

end program

Holy cow!

I wrote a little reader program so that I could classify the lines in the classifier file. My ‘tolowerh’ function fails! It looks like gfortran may be doing Hollerith constants wrong…

A Hollerith constant with fewer characters than an integer (or other) type can hold should left-justify, and blank fill, the remainder of the word. E.g., 1hc, in a 4-byte integer, should put 'c ' in the integer. It looks like gfortran is right-justifying (' c') instead!

So the corrected code for gfortran is:

  integer function tolowerh (i)
    implicit none
    integer :: i

    integer :: ic

    ic = ibits (i, 0, 8)
    if (ic .ge. 65 .and. ic .le. 90) ic = ic + 32
    tolowerh = i
    call mvbits (ic, 0, 8, tolowerh, 0)

  end function

This bears more investigation…

Maybe this is a little-/big-endian addressing convention? If so, then the location of the bits is not defined by the standard, which avoids such low-level details.

No - the Fortran 66 Standard spells out what is supposed to happen. Left-justified, blank filled. Look at section 7.2.3.8(2) where it describes the A edit descriptor and formatted I/O.
The '77 Standard moved it all to Appendix C, which basically says the same.

Gfortran is definitely wrong. Try:

program holler
  implicit none

! Experiment with F66 Hollerith data

  integer :: hw(12)
  data hw/1hh, 1he, 1hl, 1hl, 1ho, 1h , 1hw, 1ho, 1hr, 1hl, 1hd, 1h!/

  integer :: i

  write (*,'(12a1)') hw
  write (*,'(12(1x,z8))') hw

end program

Compiling/running with gfortran shows:

$ ./a.out
hello world!
 20202068 20202065 2020206C 2020206C 2020206F 20202020 20202077 2020206F 20202072 2020206C 20202064 20202021
$

Well, maybe not. Consider:

program holler2
  implicit none

! Experiment with F66 Hollerith data

  integer :: h1234(4)
  data h1234/1ha, 2hab, 3habc, 4habcd/

  write (*,'(4(1x,a4))') h1234
  write (*,'(4(1x,z8))') h1234

end program

Which prints out:

$ ./a.out
 a    ab   abc  abcd
 20202061 20206261 20636261 64636261

For little-endian addressing, the hex values that you show are left-justified and blank filled. Here is perhaps a simpler way to show this.

program hollerith
   integer :: i, hw(4)
   data hw /1hA, 2hAB, 3hABC, 4hABCD/
   write(*,'(a4,1x,z8)') (hw(i),hw(i), i=1,4)
end program hollerith

$ gfortran hollerith.f90 && a.out
[warnings for hollerith conversions removed]
A    20202041
AB   20204241
ABC  20434241
ABCD 44434241

As you can see, all four of those values are left justified and space filled. The letter ‘A’ is always the low-order 8 bits, which is the first character with little-endian addressing conventions, the letter ‘B’ is in the next 8-bit position, and so on.

Perhaps you could show what you think the output should have been?

1 Like

LOL!

Goes to show how long it’s been since I’ve done this level of Hollerith hacking…

Here is my attempt to prepare a working version of Sale’s program

! See paper "The classification of FORTRAN statements", by Arthur Sale,
!   The Computer Journal, 1971.
!    https://academic.oup.com/comjnl/article/14/1/10/356185
!   Modified source below uses Fortran 77 character variables instead of
!   the original Hollerith. Input source file to be classified must be
!   Fortran 66 basic level, all uppercase.
!
      program tsale
         character*1 line(72)
         integer ityp
         character(len=8) :: cname(36) = (/ 'Comment ','Continue',
     +       'Assignmt','Assign  ','Go To   ','Asg GoTo','CompGoTo',
     +       'Arith If','Logic If','DO      ','Continue','Call Sub',
     +       'Return  ','Stop    ','Pause   ','Read    ','Write   ',
     +       'Rewind  ','BakSpace','EndFile ','Format  ','Integer ',
     +       'Real    ','Double  ','Complex ','Logical ','External',
     +       'Dimensin','Common  ','Eqvalenc','Data    ','Blk Data',
     +       'Function','Subroutn','End     ','Rogue   ' /)
         do
            read(*,'(72A1)',end=10)line
            call class(line,ityp)
            print '(72A1,2x,i4,2x,A8)',line,ityp,cname(ityp)
         end do
   10    stop
      end

      subroutine class(k,ityp)
         integer ityp,jtyp,jsw,isw,jeq,jcma,jholl,j,l
         integer leng
         parameter (leng = 72)
         character jsave,jch,k(72),kdec(10)
         character*1 :: kc = 'C', kblnk = ' ', klpar = '(',
     +      krpar = ')', keq = '=', kh = 'H', kslsh = '/', kastk = '*',
     +      kzero = '0', kcma = ','
         character*8 :: kf = 'FUNCTION'
         character*1 kalp(46)
         integer kfal(46)
         integer ksuc(46)
         data kdec/'0','1','2','3','4','5','6','7','8','9'/
         data kalp/'I','F','N','G',  'O','T','O','(',  'C','A','O','N',
     +      'M','M','P','R',  'E','A','D','L',  'T','W','F','O',
     +      'U','D','I','A',  'O','U','W','S',  'T','U','E','N',
     +      'D','F','X','Q',  'B','A','L','A',  'L', 'P'/
         data ksuc/2,-8,-22,5,  6,7,8,-7, 10,-12,12,-11, 14,-29,-25,17,
     +     18,19,-16,-23, -13,-18,24,-21, -34,27,-28,-31, 30,-24,-17,33,
     +    -14,-33,36,37,  38,-20,-27,-30,  42,-19,-32,-4,  -26,-15/
         data kfal/4,3,-36,9,  -36,-36,-36,-5,  16,11,-36,13,
     +     -36,15,-36,23,  -36,21,20,-36,  22,-36,26,25,  -36,31,28,29,
     +     -36,-36,32,35,   34,-36,41,39, -36,-35,40,-36,  44,43,-36,45,
     +      46,-36/
! Initialize loop
         if(k(1).eq.kc)then
            jtyp = 1
            goto 55
         endif
         if(k(6).ne.kblnk .and. k(6).ne.kzero)then
            jtyp=2
            goto 55
         endif
         jsw=0
         isw=0
         jeq=0
         jcma=0
         jholl=0
         jsave=kblnk
! Assignment scan loop
         do j=7,72
            jch=k(j)
            if(jch.eq.kblnk)cycle
            if(jholl <= 0)goto 12
            do l = 1,10
               if(jch.eq.kdec(l))then
! Stil fits Hollerith constant syntax
                  jholl=jholl+1
                  goto 25
               endif
            end do
            if(jholl .le. 1)goto 11
            if(jch .eq. kh)goto 32
! Not a Hollerith constant, set switch off
   11       jholl = 0
   12       if(jch.eq.klpar)goto 20
            if(jch.eq.krpar)goto 18
            if(jch.eq.kcma)goto 22
            if(jch.eq.keq)goto 23
            if(jch.eq.kslsh)goto 21
            if(jch.eq.kastk)goto 21
            goto 25
! Right parenthesis found
   18       jsw=jsw-1
            if(jsw.gt.0)goto 25
            isw=1
            cycle
   20       jsw=jsw+1
   21       jholl=1
            goto 25
   22       if(jsw.gt.0)then
               jholl=1
               goto 25
            else
               goto 30
            endif
! Equals sign found, check level
   23       if(jsw > 0)goto 32
            jeq = 1
! Test if terminated by switch set
   25       if(isw.gt.0)goto 27
! End of assignment scan loop
         end do ! j
         goto 28
! Save last character if terminated early
   27    jsave = jch
! Leave scan and come here if (i) no more characters
!     (ii) One non-blank after right parenthesis
!     Not a DO, might be assignment
   28    if(jEQ > 0)then
            jtyp=3
            goto 55
         else
            goto 32
         endif
! Leave scan and come here if (i) Upper level comma found,
!  (ii) might be a DO, not an assignment
   30    jcma=1
         if(jEQ > 0)then
            jtyp = 10
            goto 55
         endif
! Leave scan and come here if (i) a Hollerith constant, (ii) equals in parentheses
!   (iii) Failure of DO and assignment tests (iv) neither DO nor assignment
! Enter keyword classification
   32    j=1
         isw=7
   33    jch=k(isw)
! If blank, ignore and go on to next
         if(jch.eq.kblnk)goto 37
! Test against current tree character
   34    if(jch.eq.kalp(j))goto 36
! Character does not match, try next in tree
   35    j =kfal(j)
         if(j.le.0)goto 39
         goto 34
! Character matches, try next in tree and in record
   36    j=ksuc(j)
         if(j.le.0)goto39
   37    isw=isw+1
         if(isw.le.leng)goto 33
         ! If run out of characters, force rogue type
         jch=kblnk
         goto 35
! Classification completed, form type code
   39    jtyp=-j
! Check to see if more treatment needed
         if(jtyp .lt. 5)goto 55
         if(jtyp.eq.5)then
            if(jcma.gt.0) jtyp = 6
            goto 55
         endif
         if(jtyp.lt.8)goto 55
         if(jtyp.eq.8)goto 43
         if(jtyp.lt.22.or.jtyp.gt.26)goto 55
         if(jtyp.le.26)goto 47
         print *,'Jtyp = ',jtyp
         stop 'Case not handled'
! Distinguish logical IF
   43    do l=1,10
            if(jsave.eq.kdec(l))goto 55
         end do
         jtyp=9
         goto 55
! Distinguish between assigned goto and unconditional goto
   45    if(jcma.gt.0) jtyp=6
         goto 55
! Check whether type statement or typed function
   47    l=11
         isw = 1
         goto 50
   48    l=l+1
         if(l.gt.leng)goto 55
         if(k(l).eq.kblnk)goto 48
   50    if(k(l).eq.kf(isw:isw))then
            isw=isw+1
            if(isw.le.8)goto 48
            jtyp=34
            goto 55
         endif
         if(isw.eq.1)goto 48
         isw=1
         goto 50
   55    ityp=jtyp
         return
      end

and a test input source

C AREA OF A TRIANGLE WITH A STANDARD SQUARE ROOT FUNCTION
C INPUT - TAPE READER UNIT 5, INTEGER INPUT
C OUTPUT - LINE PRINTER UNIT 6, REAL OUTPUT
C INPUT ERROR DISPLAY ERROR OUTPUT CODE 1 IN JOB CONTROL LISTING
C      READ INPUT TAPE 5, 501, IA, IB, IC
      READ(*,*) IA, IB, IC
C  501 FORMAT (3I5)
C IA, IB, AND IC MAY NOT BE NEGATIVE OR ZERO
C FURTHERMORE, THE SUM OF TWO SIDES OF A TRIANGLE
C MUST BE GREATER THAN THE THIRD SIDE, SO WE CHECK FOR THAT, TOO
      IF (IA) 777, 777, 701
  701 IF (IB) 777, 777, 702
  702 IF (IC) 777, 777, 703
  703 IF (IA+IB-IC) 777, 777, 704
  704 IF (IA+IC-IB) 777, 777, 705
  705 IF (IB+IC-IA) 777, 777, 799
  777 STOP 1
C USING HERON'S FORMULA WE CALCULATE THE
C AREA OF THE TRIANGLE
  799 S = FLOAT(IA + IB + IC) / 2.0
      AREA = SQRT( S * (S - FLOAT(IA)) * (S - FLOAT(IB)) *
     +     (S - FLOAT(IC)))
C      WRITE OUTPUT TAPE 6, 601, IA, IB, IC, AREA
      WRITE(*, 601) IA, IB, IC, AREA
  601 FORMAT (4H A= ,I5,5H  B= ,I5,5H  C= ,I5,8H  AREA= ,F10.2,
     +        13H SQUARE UNITS)
      END
1 Like

There is a Fortran-66 lexer on Github. Perhaps it is more complete in covering all aspects of that language than the classifier of this thread?

1 Like

Thanks - I will take a look at it. Clearly the Sale algorithm just does a rough cut, and further syntax checking needs to be done for many of the cases.

BTW, at label 12 in your code, those IF statements look like a prime candidate for restructuring some of the code into a SELECT CASE construct.