The Perils of Polishing (LONG!)

I have been working with the SLATEC library. Following the closure of Absoft and Lahey during the last few months, I felt that I needed to preserve their versions of the library, which they included as free add-ons that their customers could choose to use. Lahey provided nice HTML formatted documentation of the SLATEC routines; these are still up on their web site, but for how long? Some users of Silverfrost FTN95 expressed interest in SLATEC, and I put some effort into getting the library to work with FTN95.

SLATEC is a large library, old but high quality, and huge (about 1900 subroutines), Any calculation, such as evaluating an integral or solving a least squares problem requires as many as 90 of those subroutines. The number of bugs divided by the lines of code is very low, but the number of bugs in the whole library is not negligible. Given these features, it is to be noted that finding and fixing those bugs is no easy task. Doing the same with a modern version such as Mehdi Chinoun’s ( GitHub - MehdiChinoune/SLATEC ) fine contribution is nearly impossible, as I found after trying. I had to accept that I should work with the Netlib Fortran 77 version, at least for the bug hunting work.

Some of the subroutines still contain ASSIGNed GOTOs, and there are a number of EQUIVALENCE and COMMON declarations, which modern compilers will complain about. Polishing the code, and doing so without introducing new errors, is difficult with such a large codebase.

Here is a sample of the kind of problems that I ran into. One of the subroutines is the Linpack routine CGEDI. A portion of the code in the CGEDI included in SLATEC:

C***FIRST EXECUTABLE STATEMENT  CGEDI
C
C     COMPUTE DETERMINANT
C
      IF (JOB/10 .EQ. 0) GO TO 70
         DET(1) = (1.0E0,0.0E0)
         DET(2) = (0.0E0,0.0E0)
         TEN = 10.0E0
         DO 50 I = 1, N
            IF (IPVT(I) .NE. I) DET(1) = -DET(1)
            DET(1) = A(I,I)*DET(1)
            IF (CABS1(DET(1)) .EQ. 0.0E0) GO TO 60
   10       IF (CABS1(DET(1)) .GE. 1.0E0) GO TO 20
               DET(1) = CMPLX(TEN,0.0E0)*DET(1)
               DET(2) = DET(2) - (1.0E0,0.0E0)
            GO TO 10
   20       CONTINUE
   30       IF (CABS1(DET(1)) .LT. TEN) GO TO 40
               DET(1) = DET(1)/CMPLX(TEN,0.0E0)
               DET(2) = DET(2) + (1.0E0,0.0E0)
            GO TO 30
   40       CONTINUE
   50    CONTINUE
   60    CONTINUE
   70 CONTINUE

I found that there appears to be a newer (appears; the dates indicate otherwise!) version at Netlib, www.netlib.org/linpack/cgedi.f , which is a bit easier on the eye:

c
c     compute determinant
c
      if (job/10 .eq. 0) go to 70
         det(1) = (1.0e0,0.0e0)
         det(2) = (0.0e0,0.0e0)
         ten = 10.0e0
         do 50 i = 1, n
            if (ipvt(i) .ne. i) det(1) = -det(1)
            det(1) = a(i,i)*det(1)
c        ...exit
            if (cabs1(det(1)) .eq. 0.0e0) go to 60
   10       if (cabs1(det(1)) .ge. 1.0e0) go to 20
               det(1) = cmplx(ten,0.0e0)*det(1)
               det(2) = det(2) - (1.0e0,0.0e0)
            go to 10
   20       continue
   30       if (cabs1(det(1)) .lt. ten) go to 40
               det(1) = det(1)/cmplx(ten,0.0e0)
               det(2) = det(2) + (1.0e0,0.0e0)
            go to 30
   40       continue
   50    continue
   60    continue
   70 continue

Still too many statement labels and GO TOs. I detected a DO WHILE loop between statements 30 and 40, and substituted the following, after checking to make sure that the statement labels in these lines were not referred to elsewhere:

            do while (cabs1(det(1)) .ge. ten)
               det(1) = det(1) * 0.1e0
               det(2) = det(2) + 1.0e0
            end do

This change (and similar changes to other portions of SLATEC) gave me code that was easier to work with. I found a number of bugs and fixed most of them. I felt pleased with the result, and started testing the corrected code with a number of compilers. All the 54 tests of SLATEC passed with a number of compilers (Intel, Lahey, Gfortran, Absoft, Silverfrost, CVF, all on Windows).

Today, I chanced to run tests using the old Microsoft FPS 4.0 compiler. Test 23 failed, and in a bad way – infinite loop in the block of code that I showed above, and it took a good bit of effort to locate the bug.

What the FPS 4.0 compiler did (with or without optimization requested) is to move the evaluation of the test clause to just before entering the loop.

FPS 4.0 is obsolete and has a reputation for being buggy, but my experience is that it is not as bad as it is made out to be. It has often generated correct code where newer compilers had failed. It provides one more hoop to make the test code jump through.

Some lessons that I learned from the exercise:

  • Polishing code can make compiler bugs surface, and introduce new bugs into the polished code regardless of which compiler is used

  • Putting common blocks, subroutines and function bodies into Fortran modules is definitely worthwhile, but not without risks. In a large body of code, doing so can make debugging impossible because of the complexity of module dependencies. (Watch out for mis-converted common blocks with different member names in different places, and agglomerations of place-holder variables! Code polishing tools fail spectacularly with these.)

  • There are many instances of scalars passed to array dummy arguments, work array sections of mismatched types used as actual arguments, etc. At the time that the original SLATEC codes were written, there probably was no way to avoid such usage, and it is not clear how to repair such code. If we ask the compiler to overlook such errors in some places, we have to live with not being able to check related errors in other places.

  • Old code has bugs, but so does new code. Compilers have bugs. Library code may have bugs. The programmer’s mind may have bugs (speaking about yours truly here).

  • When a program fails, is it because of bugs in the code, in the compiler, or in a library? We may fix code or resort to work-arounds if we have the source code and the necessary knowledge and ability. Compiler bugs will have to be reported, but may take months or years to fix through a support service.

  • If it ain’t broke, try harder!

  • Familiarity breeds respect.

In summary, I suggest that some modernization tasks should be performed in two stages. The first stage would focus on making the code easier to work with, replace spaghetti code with structured code, declare all variables and get rid of troublesome features such as common blocks. After this cleaned/converted code passes testing by a number of users and is more or less bug free, we may move on to the next stage of modernization, using module procedures, derived types, etc. None of my comments apply to code that is being written afresh now.

I look forward to your reactions. Thanks for reading this long post.

10 Likes

I have followed a rather similar process trying to modernise a library from Netlib for special functions. My strategy was to first write a test program that exercises a subset of the original routines, with special emphasis on the corner cases, and then modernise the code (free form, remove holleriths, goto statements, etc.). The test program should result in exactly the same output, with possible small deviations because of the use of standard implementations of such functions as sine and log. The second round was to put these functions and routines in modules. At all stages I compared the results from the test programs against the results with the original code.
That library holds only a few dozen functions and routines, so it is not as gargantuan as the SLATEC library, but still it requires a fair amount of work.
I have not read your post in full yet, but thought a first reaction to be appropriate.

1 Like

Thanks for your comments, Arjen. I think that all the special functions that are included in the Fullerton collections are also included in SLATEC. The SLATEC programs Test02, Test03 and Test04 cover single-precision real, double-precision real and single-precision complex versions of special functions.

I have seen code like this (or the f77 version) in several linear algebra libraries. The problem is that determinants can span many orders of magnitude and can easily exceed the floating point exponent range. These loops keep track separately of the exponent and the scaled value, avoiding the overflow and underflow problem. The posted code is for complex determinants, but the same issue arises for real determinants.

Another approach to handle this situation involves computing the log of the determinant rather than the determinant itself. The corner case is when the determinant is zero, which must be handled separately if it can occur. If the sign or complex phase is arbitrary, or if the intermediate signs/phases are important, then that must be tracked separately too.

In modern fortran, I think I would rather do this computation with a derived type. One member would be the det(1) value, i.e. a scaled value that is kept within some restricted range of values. The other member would be the integer exponent, stored above in det(2). Of course, if working with a legacy code, then this argument change would need to be propagated throughout.

I’ve noticed that the logic is almost always done with decimal exponents. This has always puzzled me. I would think that it would be more natural to use binary exponent ranges. That way, all of the exponent scaling can be done by just shifting the exponent values, there would be no need to actually do the floating point multiplications and divisions or to touch the mantissa. In modern fortran, there are the EXPONENT() and SCALE() intrinsics which allow this to be done in a portable way.

Having also attempted to refactor some of the SLATEC routines I agree with almost all of your comments. I’ve found that if you are serious about trying to refactor something as large as SLATEC it’s worthwhile to either spend some money to acquire existing refactoring tools or some time to develop your own. For me the first step is conversion to free form. About 20 years ago I wrote my own tool (in Fortran) to do that. I set it up to process multiple files in a directory in one run so I was able to process the entire SLATEC src directory less than 10 seconds or so. Basically, it just lowered case, changed comments and added F90 continuation syntax. The hard part then becomes deciding how many GO TO’s you want or need to get rid of and when to replace them with structured IF-THEN_ELSE blocks or DO WHILE statements. Due to the possibility that the compiler might chose to change the order of execution as an optimization you really do need to be careful with these if your goal is to exactly reproduce the output from the original code. While refactoring can be an interesting technical exercise you do need to do a cost/benefit analysis before undertaking something as large as SLATEC.

Re. the issue of passing scalars to array arguments, I’ve found the path of least resistance is to write a wrapper routine that takes in the scalar, creates local arrays of size one and then calls the array routine. This only takes a relatively few lines of code and you then create a generic interface to call both versions.

Tremendous effort!
Ideally, before refactoring you should write several tests for each of the 1900 procedures… :cold_face:

Or just forget Fortran and rewrite the entire code base in Julia or Rust since they are (according to some) suppose to be the future of programming :smiley:

Or don’t rewrite them and use the already written better replacements for most of them.
DASSL and DEPAC are worse than OrdinaryDiffEq.jl. FFTPACK is worse than FFTW. MKL has better BLAS implimentations, etc. SLATEC is just buggy old code that should have been abandoned 30 years ago.

But where is the fun in that :grin: How would us poor Fortran “practitioners” make it through the day without something to gripe about and a wall to beat our heads against.

1 Like

See the apt comment by @vmagnin

It helps greatly to approach refactoring and modernization as development effort driven by tests (TDD), documentation (DDD), and APIs (IDD). There can be no a priori limit to the number of stages or iterations. The three most important elements are always tests, tests, tests.

Several responses have mentioned tests. The package itself contains 54 test drivers, but each driver calls many of the computational routines, not just one. In response to the comments here, I recorded coverage data, and I found that of the 1793 callable subroutines/functions, 335 were not called or executed at all. A small number of these are service routines to report errors. Among the routines that were entered, a small number (e.g., I1MACH) were entered multiple times.

You may recognize that among these are several BLAS and Linpack routines that are often available in other distributions, or may have been supplanted by, e.g., Lapack routines.

Here is a list of the routines that were not covered in the test set.

Unused routines of SLATEC:

  BAKVEC  BALBAK  BANDR   BANDV   BCRH    BISECT  BNDSOL  BQR     BSPDOC  BSRH
  C1MERG  CBLKT1  CBLKTR  CCHDD   CCHEX   CCHUD   CCMPB   CDZRO   CG      CH
  CHKPR4  CHKPRM  CHKSN4  CHKSNG  CINVIT  CMGNBN  CMPCSG  CMPOSD  CMPOSN  CMPOSP
  CMPTR3  CMPTRX  CNBDI   COMBAK  COMHES  COMLR   COMLR2  CORTB   CPADD   CPEVLR
  CPROC   CPROCP  CPROD   CPRODP  CROTG   D1MERG  D9AIMP  D9GMIC  D9LGIT  DBEG
  DBHIN   DBLAT2  DBLAT3  DBNDSL  DCHDC   DCHDD   DCHEX   DCHK12  DCHK13  DCHK22
  DCHK23  DCHK32  DCHK33  DCHK42  DCHK43  DCHK52  DCHK53  DCHK62  DCHKE2  DCHKE3
  DCHUD   DCOPYM  DCPPLT  DDAINI  DDZRO   DEFC    DEFCMN  DEFE4   DEFER   DFULMT
  DGBCO   DGBDI   DGBMV   DGEDI   DGEMM   DGEMV   DGER    DLPDOC  DMAKE2  DMAKE3
  DMMCH   DMVCH   DNBDI   DOHTRL  DPBCO   DPBDI   DPBFA   DPBSL   DPCHCE  DPCHDF
  DPCHSW  DPODI   DPPCO   DPPDI   DPPFA   DPPSL   DPRVEC  DPRWPG  DPRWVR  DPTSL
  DQK31   DQK41   DQK51   DQRDC   DQRSL   DREADP  DRSCO   DSBMV   DSDSCL  DSICO
  DSIDI   DSIFA   DSISL   DSPCO   DSPDI   DSPFA   DSPMV   DSPR    DSPR2   DSPSL
  DSVCO   DSVDC   DSYMM   DSYMV   DSYR    DSYR2   DSYR2K  DSYRK   DTBMV   DTBSV
  DTIN    DTOUT   DTPMV   DTPSV   DTRCO   DTRDI   DTRMM   DTRMV   DTRSL   DTRSM
  DTRSV   DUIVP   DUVEC   DVECS   DWNLT1  DWNLT2  DWNLT3  DWRITP  DX      DX4
  DXLCAL  DY      DY4     EFC     EFCMN   EISDOC  ELMBAK  ELMHES  ELTRAN  FFTDOC
  FIGI    FIGI2   FULMAT  FUNDOC  HQR2    HSTCRT  HSTCS1  HSTCSP  HSTCYL  HSTPLR
  HSTSSP  HTRIB3  HTRID3  HW3CRT  I1MERG  ICOPY   IMTQL1  IMTQLV  INVIT   INXCA
  INXCB   INXCC   LA05ED  LA05ES  LDE     LDERES  LSSODS  MINFIT  MINSO4  MINSOL
  MPERR   MPMAXR  MPOVFL  MPUNFL  OHTROL  OHTROR  ORTBAK  ORTHO4  ORTHOG  ORTHOL
  ORTRAN  PCHCE   PCHDF   PCHDOC  PCHSW   PGSF    POIS3D  POISTG  POS3D1  POSTG2
  PPADD   PPGSF   PPPSF   PPSGF   PPSPF   PROC    PROCP   PRODP   PRVEC   PRWPGE
  PRWVIR  PSGF    QK31    QK41    QK51    QPDOC   QZHES   QZIT    QZVAL   QZVEC
  R9AIMP  R9GMIC  R9LGIT  RATQR   REBAK   REBAKB  REDUC   REDUC2  RG      RGAUSS
  RGG     RS      RSB     RSCO    RSG     RSGAB   RSGBA   RSP     RST     RT
  RUNIF   SBHIN   SCHDC   SCHDD   SCHEX   SCHUD   SCLOSM  SCPPLT  SDAINI  SDZRO
  SEPELI  SEPX4   SGBCO   SGBDI   SGEDI   SLPDOC  SNBDI   SODS    SOPENM  SPBCO
  SPBDI   SPBFA   SPBSL   SPELI4  SPELIP  SPODI   SPPCO   SPPDI   SPPFA   SPPSL
  SPTSL   SQRDC   SQRSL   SREADP  SSDSCL  SSICO   SSIDI   SSIFA   SSISL   SSPCO
  SSPDI   SSPFA   SSPSL   SSVDC   STIN    STOUT   STRCO   STRDI   STRSL   SVCO
  SVD     SVECS   SWRITP  SXLCAL  TEVLC   TINVIT  TQL1    TQL2    TRBAK1  TRBAK3
  TRED1   TRIDIB  TRIDQ   TRIS4   TRISP   TSTURM  UIVP    UVEC    WNLT1   WNLT2
  WNLT3   XERHLT  XGETUN  XSETUA  ZKSCL
Count of unused routines:  335

Before undertaking a full blown refactor of SLATEC, remember that parts of it have already been extracted, refactored, and modernized. For example, some the libraries I mention here have components from SLATEC. I think it’s best to save what’s worth saving and put it in more focused modern libraries that can be expanded and improved by domain experts, and that’s where any bugs should be fixed. We don’t need a drop-in replacement for the SLATEC library, people need to move on.

1 Like

I don’t use do while. Is the code block above equivalent to

        do
           if (cabs1(det(1)) .lt. ten) exit
           det(1) = det(1) * 0.1e0
           det(2) = det(2) + 1.0e0
        end do

And should the test actually be placed just before the end do to make it work with all compilers?

Can you please explain what the FPS 4.0 compiler did wrong? (actual vs. expected) … I feel I’m missing the point here.

The Fortran standard has a very clear statement about the do while ( logical expr ) construct:

If loop-control is [ , ] WHILE (scalar-logical-expr), the effect is as if loop-control were omitted and the following statement inserted as the first statement of the block:

IF (.NOT. (scalar- logical-expr )) EXIT

Welcome to the forum.

The compiler bug is not worth much discussion, since it pertains to an obsolete compiler. If you have that compiler, and you wish to demonstrate the bug to yourself, compile and run the attached self-contained test program.

tcgc.f90 (1.6 KB)

P.S. (Added a day later) Another old Fortran compiler, Intel 7.0 for Windows, also exhibits this bug.

Here is the output of the program when compiled with FPS 4.0:

  Iter            det(1)                    det(2)
     1    3.3003E+00  0.0000E+00    1.0000E+00  0.0000E+00  -
     2    3.3003E-01  0.0000E+00    2.0000E+00  0.0000E+00  -
     3    3.3003E-02  0.0000E+00    3.0000E+00  0.0000E+00  -
     4    3.3003E-03  0.0000E+00    4.0000E+00  0.0000E+00  -
     5    3.3003E-04  0.0000E+00    5.0000E+00  0.0000E+00  -
Iteration limit 2 in cged

Note that det(1) is being computed and updated correctly, but that stale values are being used in computing the expression that is used to decide when to terminate the loop. The code in the library did not monitor the iteration count to guard against this runaway behavior (nor would it be reasonable to expect such a guard).

In contrast, here is what the program should have output (and, for example, Gfortran 11.3 does):

   Iter            det(1)                    det(2)
      1    3.3003E+00  0.0000E+00    1.0000E+00  0.0000E+00  -

 Det =     3.3003E+00  0.0000E+00    1.0000E+00  0.0000E+00

Hi. Thank you for the quick response. No, I don’t have the old compiler, but I used to work in Fortran when I was a beginner, and I still feel a bit nostalgic about it. I got the impression I missed something vital in your post. Does the phrase “… is to move the evaluation of the test clause to just before entering the loop” mean, that the goto-based behavior is lost? In what way?

For Fortran 95 code (including the extensions from the Technical Report) g95 should also be used. It was a robust compiler and had options to detect bad code.

The post is being discussed at Hacker News.

1 Like

Have you noticed the stats of this discussion thread on the main page? There are 8.1k views, but the first post was made only two days ago… I don’t contest that this discussion is interesting! :grinning: but 8.1k seems anomalous. I have never seen such a behavior in the Discourse.

Thanks for posting it there. This topic is now #4 the most viewed on Fortran Discourse of all time.

2 Likes