A Comparison of Programming Languages in Economics

I looked at the Fortran code and noticed that the author does not use the d0 suffix for double precision real literals. I am not sure whether that affects correctness or performance.

1 Like

I don’t see why it would affect performance or correctness; the compiler will convert an integer or real literal that was meant to be double precision (or whatever else) as needed. However, I see it as bad programming practice (it is extremely common in C, by the way). It will also cause tons of warnings if you have the -Wconversion or -Wconversion-extra compilation flags enabled (which I always have), or the equivalent flags in compilers other than gfortran.

I have several times written example programs to demonstrate that this can affect answers. Usually it doesn’t impact things in a significant way, but if you are using an algorithm that is highly sensitive it could.

The issue is that the conversion to double precision happens only after the literal has already been converted to the default (single precision) floating point representation. Thus, if the number is not exactly representable as a 32-bit floating point number, it will be “rounded” to the nearest 32-bit value and then converted to a 64-bit value. If you specify the suffix, the “rounding” is done directly to the nearest 64-bit value.

Like I said, most of the time the difference is inconsequential, but it is there.

program main
  implicit none

  ! default-real (single precision) literal, rounded to 32 bits before conversion to real(8)
  real(8), parameter :: tolerance1 = 0.0000001
  ! double precision literal, rounded directly to the nearest 64-bit value
  real(8), parameter :: tolerance2 = 0.0000001d0

  print '(F20.16)', tolerance1
  print '(F20.16)', tolerance2
end program main

Using fpm run (GFortran 11.2.0):

  0.0000001000000012
  0.0000001000000000
1 Like

The 15 different ways to declare real variables in Fortran are a HUGE source of errors and confusion. This includes using singles while thinking you are getting doubles.

If it were up to me, Fortran would deprecate all but one way: real(wp) and x = 1.0_wp, where wp (or whatever you want to name it) must be declared. Away with 1.0, 1.0d0, real(8), double precision, real*8, etc.

And yes, I know this will never happen.
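
Roughly the pattern I have in mind, as a sketch only (the module and the name wp are placeholders, and wp could just as well be defined via selected_real_kind):

module working_precision
  use, intrinsic :: iso_fortran_env, only : real64
  implicit none
  integer, parameter :: wp = real64   ! the single working precision used everywhere
end module working_precision

program demo
  use working_precision, only : wp
  implicit none
  real(wp) :: x
  x = 1.0_wp                          ! every real literal carries the _wp suffix
  print '(F20.16)', x
end program demo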

2 Likes

It’s convenient to write

integer, parameter :: dp = kind(1.0d0)

and easier to remember and less intimidating than

integer, parameter :: dp = selected_real_kind(15, 307)

It would be irregular to prohibit a literal such as 8 where a named constant can appear. Regarding real*8, the committee cannot remove from the language what was never put in.

1 Like
real(8) :: x
x = 5d-1

This seems very straightforward and convenient to me. I do agree that real*8 and double precision should be discouraged.

But real(8) is unfortunately not portable:

❯ nagfor test.f90 
NAG Fortran Compiler Release 7.1(Hanzomon) Build 7104
Questionable: test.f90, line 3: Variable X set but never referenced
Error: test.f90, line 1: KIND value (8) does not specify a valid representation method
Errors in declarations, no further processing for $main$
[NAG Fortran Compiler error termination, 1 error, 1 warning]
1 Like

I am aware of that. In my area, people only use GFortran. Most of the Fortran tagged questions at Stack Overflow also use GFortran. So that’s not a problem for me.

Yep I’m well aware of it. The vendors and the committee have always prioritized the legacy codes over trying to get new people to write Fortran. In my view, it’s one of the main reasons for Fortran’s decline.

2 Likes

If it were harder for people new to Fortran to come across real*8/real(8), then they wouldn’t know that the compiler even accepts such usage. Maybe if we clean up the legacy codes, people will learn from good practice and not repeat archaic forms.

5 Likes

I agree with this point, and I believe this issue can be solved with better documentation. This paper and its author are highly regarded in computational economics. If even he misuses double precision real literals, it is no wonder so many others make the same mistake.

1 Like

At least it’s not C++…

2 Likes

I wish it were a little more specific about the compiler options used and so on. Since the code was so short, it was easy enough to get some numbers. On a very modest Linux laptop old enough to be the same vintage as the report, using “before” and “after” versions of the code (making no other changes) and running each 100 times, I got

100 runs
       Average (secs)    | Shortest   | Longest
Before 1.102803080400001 | 1.08397698 | 1.24383700 
After  1.1099996502      | 1.09448707 | 1.21478796 

with gfortran, which is quite a bit different from the 1.6 seconds in the article, but that was a very different platform (an MS Windows box).

So any performance difference is small. I would have to change the code or instrument it to measure it any better, but there is some chance it could be up to 1%, though I doubt it.

It does produce slightly different answers, as can be expected from the above discussion.

I just used fpm(1), copied the file over app/main.f90, and fired it up using the default debug profile and then --profile release. The results shown are for --profile release. The interesting part is that the two profiles ran at significantly different speeds (debug mode was 1.9 seconds, whereas release mode was 1.1 seconds). So I am not sure what compiler options were used, and the report saying ifort speed changed by something like 7 or 9 percent could be because multiple runs were not averaged, and so on.

Note the test is so small that simple changes, like writing to the screen versus writing to an unbuffered file, give around a five percent change on my platform. So I do not think my results change the general conclusions of the paper significantly; they just raise some questions about the small changes relative to the previous results and such.

The only code change was to define a “dp” kind parameter, change all the constants to “_dp” constants, and change real(8) to real(dp).

PS: There is a transpose of an array and then a repeated transpose back that caught my eye, as I did not notice the array being changed anywhere. I did not look into it, but the way it was repeated would slow down an interpreted language.

2 Likes

It’s even more convenient to write
integer, parameter :: dp = kind(1d0)

If the link to the Fortran code provided upthread truly reflects the performance data in the first paper, the results mentioned in that paper for Fortran are unreliable. It is rather poor instrumentation of the code, absolutely at a beginner level. Immediately, the lack of the following is glaring:

  • in situ, the code can be instrumented to run a certain number of trials, say NTRIALS, where NTRIALS is a reasonable number such as a dozen. After discarding the fastest and the slowest times, the CPU usage can then be taken as the median of the rest. Then repeat the same with a certain number of execution attempts, say NRUNS, and take the overall median (see the sketch after this list),
  • use SYSTEM_CLOCK, with tick counts held in an integer type with a decimal range of at least 10 (colloquially, a 64-bit integer or better), rather than CPU_TIME to measure timing,
  • separate in the code, with suitable instrumentation, the measurement of strictly numerical computing instructions from those involving output (and input) via PRINT (and READ) statements, and report them distinctly. If the times toward I/O are on the same order of magnitude as the numerical operations, there is little to no reason for those keen on Fortran to pay much attention to such studies; the I/O time will not be that competitive, and besides, it will be of little practical interest anyway.
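
Roughly what I have in mind, as a rough sketch only (compute_kernel is just a placeholder for the numerical section being measured, and the trial count is arbitrary):

program timing_harness
  use, intrinsic :: iso_fortran_env, only : int64, real64
  implicit none
  integer, parameter :: NTRIALS = 12
  integer(int64) :: t0, t1, rate
  real(real64)   :: times(NTRIALS), trimmed(NTRIALS-2)
  integer        :: i

  call system_clock(count_rate=rate)
  do i = 1, NTRIALS
     call system_clock(t0)
     call compute_kernel()            ! placeholder for the purely numerical section (no I/O)
     call system_clock(t1)
     times(i) = real(t1 - t0, real64) / real(rate, real64)
  end do
  call sort(times)                    ! ascending
  trimmed = times(2:NTRIALS-1)        ! discard the fastest and the slowest
  print '(a,f12.6,a)', 'median time: ', &
     0.5_real64*(trimmed(size(trimmed)/2) + trimmed(size(trimmed)/2+1)), ' s'

contains

  subroutine compute_kernel()
     ! stand-in for the computation being measured
  end subroutine compute_kernel

  subroutine sort(a)                  ! simple insertion sort, ascending
     real(real64), intent(inout) :: a(:)
     real(real64) :: tmp
     integer :: j, k
     do j = 2, size(a)
        tmp = a(j)
        k = j - 1
        do while (k >= 1)
           if (a(k) <= tmp) exit
           a(k+1) = a(k)
           k = k - 1
        end do
        a(k+1) = tmp
     end do
  end subroutine sort
end program timing_harness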

It would be worth experimenting with a BLAS call to ?gemm for the operation:

     expectedValueFunction = matmul(mValueFunction,transpose(mTransition))
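
For reference, assuming a standard BLAS dgemm is linked in (e.g. the reference BLAS, OpenBLAS, or MKL), that line could be replaced with something like the following untested sketch, using the array extents from the posted code:

     ! expectedValueFunction = mValueFunction * transpose(mTransition)
     call dgemm('N', 'T', nGridCapital, nGridProductivity, nGridProductivity, &
                1.0_dp, mValueFunction, nGridCapital,                         &
                mTransition, nGridProductivity,                               &
                0.0_dp, expectedValueFunction, nGridCapital)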

I also agree with @FortranFan’s observation to separate the I/O part. Currently, the printing is performed quite frequently, every 10 iterations.

The textbook Introduction to Computational Economics using Fortran also uses real*8 to declare double precision real numbers.
https://www.ce-fortran.com/source-codes/

Again, the changes do not immediately change the general conclusions, but
it seems like a good place to show how

  • the warning flags of the compiler should be used to their fullest
  • a few compiler options can affect speed significantly,
  • the often overlooked gprof(1) and gcov(1) tools can tell you a
    lot about how your code is used and where it spends time with
    very little effort.

So for users with the GCC programs gfortran, gprof, and gcov you should
give something like this a try if you are interested in information for
improving performance or testing coverage …

#!/bin/bash
(
exec 2>&1
# do everything in a scratch directory
SCR=scr_$(uuidgen)
mkdir -p $SCR
cp econ1.f90 $SCR/
cd $SCR
gfortran -g -pg -fprofile-arcs -ftest-coverage -O0 econ1.f90 -o econ1
./econ1
# better when you have a lot of procedures
gprof --brief econ1 gmon.out  
# better for short programs
gprof --brief --line econ1 gmon.out 
# coverage report
gcov --stdout econ1.f90
) >reports.out
exit

and simply compiling with -Wall -Wextra identified a few easily corrected
issues. It is always good practice to use all those optional warnings!

And I was off a bit on my guess about performance when using 64-bit
constants instead of 32-bit ones. Using a more rigorous timing test and
1,000 executions, it does slow things down on my platform, by 1.36 percent.
I am still a bit surprised, as almost all of the usage of 32-bit values
was combined with 64-bit values.

By the way, for fun I built it using 128-bit values and got about the
same answer but it was 45 times slower (with gfortran). I expected a
slowdown but not of that order either. Live and learn.
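
With the dp-parameterized version, that was a one-line change, something along these lines:

! 33 significant digits selects an IEEE binary128 kind where the compiler supports one
integer, parameter :: dp = selected_real_kind(33, 4931)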

So just using those example commands, you can tell via gcov(1) which lines
of your code were executed (in this trivial case not so interesting except
for the counts, but if you are building a test suite that is good information).

And you can tell that three lines account for over half of the CPU time
consumed.

In this case the procedure report is essentially null, but in a big
program it can quickly identify which procedures are consuming the
CPU time.

gprof(1) and gcov(1) have their limitations but are often very useful. You
should read up on them and similar tools if this is interesting.

And as a footnote, using unrolling and a few other switches on my platform
took ifort(1) to the top of the heap at 1.04 seconds.

reports program output
 Steady State values
 Output:   0.56273143387317726      Capital:   0.17819828739139082      Consumption:   0.38453314648178644     
 Iteration:           1 Sup Diff:   5.2741593406882385E-002
 Iteration:          10 Sup Diff:   3.1346949265582680E-002
 Iteration:          20 Sup Diff:   1.8703459893196328E-002
 Iteration:          30 Sup Diff:   1.1165512033874614E-002
 Iteration:          40 Sup Diff:   6.6685417080749598E-003
 Iteration:          50 Sup Diff:   3.9842927486826163E-003
 Iteration:          60 Sup Diff:   2.3813118039122116E-003
 Iteration:          70 Sup Diff:   1.4236586450859789E-003
 Iteration:          80 Sup Diff:   8.5133977471318900E-004
 Iteration:          90 Sup Diff:   5.0920517522479170E-004
 Iteration:         100 Sup Diff:   3.0462324421209885E-004
 Iteration:         110 Sup Diff:   1.8226485632144573E-004
 Iteration:         120 Sup Diff:   1.0906950872535681E-004
 Iteration:         130 Sup Diff:   6.5276437362649098E-005
 Iteration:         140 Sup Diff:   3.9071082119535028E-005
 Iteration:         150 Sup Diff:   2.3388077119768091E-005
 Iteration:         160 Sup Diff:   1.4008644636964718E-005
 Iteration:         170 Sup Diff:   8.3913172028715621E-006
 Iteration:         180 Sup Diff:   5.0264745379280384E-006
 Iteration:         190 Sup Diff:   3.0108998635425266E-006
 Iteration:         200 Sup Diff:   1.8035522479920019E-006
 Iteration:         210 Sup Diff:   1.0803409159487742E-006
 Iteration:         220 Sup Diff:   6.4713169423136208E-007
 Iteration:         230 Sup Diff:   3.8763619381043668E-007
 Iteration:         240 Sup Diff:   2.3219657918627234E-007
 Iteration:         250 Sup Diff:   1.3908720930544405E-007
 Iteration:         257 Sup Diff:   9.7160356538061876E-008
  
 My check:  0.14654914369569541     
  
 Elapsed time is    1.73320293    

gprof by procedure

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
100.10      0.75     0.75        1   750.73   750.73  MAIN__

                        Call graph


granularity: each sample hit covers 2 byte(s) for 1.33% of 0.75 seconds

index % time    self  children    called     name
                0.75    0.00       1/1           main [2]
[1]    100.0    0.75    0.00       1         MAIN__ [1]
-----------------------------------------------
                                                 <spontaneous>
[2]    100.0    0.00    0.75                 main [2]
                0.75    0.00       1/1           MAIN__ [1]
-----------------------------------------------
 
Index by function name

   [1] MAIN__ (econ1.f90)
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
 28.69      0.22     0.22                             MAIN__ (econ1.f90:75 @ 2d76)
 21.35      0.38     0.16                             MAIN__ (econ1.f90:76 @ 2de1)
 16.02      0.50     0.12                             MAIN__ (econ1.f90:74 @ 2d22)
 10.68      0.58     0.08                             MAIN__ (econ1.f90:88 @ 2eb8)
  4.67      0.61     0.04                             MAIN__ (econ1.f90:73 @ 2cf5)
  4.00      0.64     0.03                             MAIN__ (econ1.f90:85 @ 2e71)
  2.67      0.66     0.02                             MAIN__ (econ1.f90:71 @ 2cba)
  2.67      0.68     0.02                             MAIN__ (econ1.f90:73 @ 2e21)
  2.67      0.70     0.02                             MAIN__ (econ1.f90:71 @ 2ea4)
  2.00      0.72     0.02                             MAIN__ (econ1.f90:77 @ 2def)
  2.00      0.73     0.02                             MAIN__ (econ1.f90:84 @ 2e3e)
  1.33      0.74     0.01                             MAIN__ (econ1.f90:78 @ 2df9)
  0.67      0.75     0.01                             MAIN__ (econ1.f90:81 @ 2ded)
  0.67      0.75     0.01                             MAIN__ (econ1.f90:76 @ 2e2a)
  0.00      0.75     0.00        1     0.00     0.00  MAIN__ (econ1.f90:6 @ 2429)

                        Call graph


granularity: each sample hit covers 2 byte(s) for 1.33% of 0.75 seconds

index % time    self  children    called     name
                0.00    0.00       1/1           main (econ1.f90:104 @ 3680) [70]
[15]     0.0    0.00    0.00       1         MAIN__ (econ1.f90:6 @ 2429) [15]
-----------------------------------------------

Index by function name

  [15] MAIN__ (econ1.f90:6 @ 2429) [2] MAIN__ (econ1.f90:76 @ 2de1) [14] MAIN__ (econ1.f90:76 @ 2e2a)
   [7] MAIN__ (econ1.f90:71 @ 2cba) [13] MAIN__ (econ1.f90:81 @ 2ded) [11] MAIN__ (econ1.f90:84 @ 2e3e)
   [5] MAIN__ (econ1.f90:73 @ 2cf5) [10] MAIN__ (econ1.f90:77 @ 2def) [6] MAIN__ (econ1.f90:85 @ 2e71)
   [3] MAIN__ (econ1.f90:74 @ 2d22) [12] MAIN__ (econ1.f90:78 @ 2df9) [9] MAIN__ (econ1.f90:71 @ 2ea4)
   [1] MAIN__ (econ1.f90:75 @ 2d76) [8] MAIN__ (econ1.f90:73 @ 2e21) [4] MAIN__ (econ1.f90:88 @ 2eb8)

coverage report. Note the lines are numbered and match up to the gprof lines

econ1.gcno:version 'B03*', prefer 'A93*'
econ1.gcda:version 'B03*', prefer version 'A93*'
        -:    0:Source:econ1.f90
        -:    0:Graph:econ1.gcno
        -:    0:Data:econ1.gcda
        -:    0:Runs:1
        -:    1:!============================================================================
        -:    2:! Name        : RBC_F90.f90
        -:    3:! Description : Basic RBC model with full depreciation
        -:    4:! Date        : July 21, 2013
        -:    5:!============================================================================
        1:    6:program RBC_F90
        -:    7:!----------------------------------------------------------------
        -:    8:! 0. variables to be defined
        -:    9:!----------------------------------------------------------------
        -:   10:implicit none
        -:   11:integer,  parameter :: dp=kind(0.0d0)
        -:   12:integer,  parameter :: nGridCapital = 17820
        -:   13:integer,  parameter :: nGridProductivity = 5
        -:   14:real(dp), parameter :: tolerance = 0.0000001_dp
        -:   15:real     :: time_begin, time_end
        -:   16:integer  :: nCapital, nCapitalNextPeriod, gridCapitalNextPeriod, nProductivity
        -:   17:integer  :: iteration
        -:   18:real(dp) :: aalpha, bbeta, capitalSteadyState, outputSteadyState, consumptionSteadyState
        -:   19:real(dp) :: valueHighSoFar, valueProvisional, consumption, capitalChoice
        -:   20:real(dp) :: maxDifference
        -:   21:real(dp), dimension(nGridProductivity)                   :: vProductivity
        -:   22:real(dp), dimension(nGridProductivity,nGridProductivity) :: mTransition
        -:   23:real(dp), dimension(nGridCapital)                        :: vGridCapital
        -:   24:real(dp), dimension(nGridCapital,nGridProductivity)      :: mOutput, mValueFunction, mValueFunctionNew, mPolicyFunction
        -:   25:real(dp), dimension(nGridCapital,nGridProductivity)      :: expectedValueFunction
        1:   26:  call CPU_TIME(time_begin)
        -:   27:  !----------------------------------------------------------------
        -:   28:  ! 1. Calibration
        -:   29:  !----------------------------------------------------------------
        1:   30:  aalpha = 0.33333333333_dp; ! Elasticity of output w.r.t
        1:   31:  bbeta  = 0.95_dp ! Discount factor
        -:   32:  ! Productivity value
        1:   33:  vProductivity = (/0.9792_dp, 0.9896_dp, 1.0000_dp, 1.0106_dp, 1.0212_dp/)
        -:   34:  ! Transition matrix
        -:   35:  mTransition = reshape( (/ &
        -:   36:   0.9727_dp,  0.0273_dp,  0._dp,      0._dp,      0._dp,      &
        -:   37:   0.0041_dp,  0.9806_dp,  0.0153_dp,  0.0_dp,     0.0_dp,     &
        -:   38:   0.0_dp,     0.0082_dp,  0.9837_dp,  0.0082_dp,  0.0_dp,     &
        -:   39:   0.0_dp,     0.0_dp,     0.0153_dp,  0.9806_dp,  0.0041_dp,  &
        1:   40:   0.0_dp,     0.0_dp,     0.0_dp,     0.0273_dp,  0.9727_dp   /),  (/5,5/))
        -:   41:  !----------------------------------------------------------------
        -:   42:  ! 2. Steady state
        -:   43:  !----------------------------------------------------------------
        1:   44:  capitalSteadyState     = (aalpha*bbeta)**(1.0_dp/(1.0_dp-aalpha))
        1:   45:  outputSteadyState      = capitalSteadyState**(aalpha)
        1:   46:  consumptionSteadyState = outputSteadyState-capitalSteadyState
        1:   47:  print *, 'Steady State values'
        1:   48:  print *, 'Output: ', outputSteadyState, 'Capital: ', capitalSteadyState, 'Consumption: ', consumptionSteadyState
        -:   49:  ! Grid for capital
    17821:   50:  do nCapital = 1, nGridCapital
    17821:   51:     vGridCapital(nCapital) = 0.5_dp*capitalSteadyState+0.00001_dp*(nCapital-1)
        -:   52:  enddo
        -:   53:  !----------------------------------------------------------------
        -:   54:  ! 3. Pre-build Output for each point in the grid
        -:   55:  !----------------------------------------------------------------
        6:   56:  do nProductivity = 1, nGridProductivity
    89106:   57:     do nCapital = 1, nGridCapital
    89105:   58:        mOutput(nCapital, nProductivity) = vProductivity(nProductivity)*(vGridCapital(nCapital)**aalpha)
        -:   59:     enddo
        -:   60:  enddo
        -:   61:  !----------------------------------------------------------------
        -:   62:  ! 4. Main Iteration
        -:   63:  !----------------------------------------------------------------
        1:   64:  maxDifference = 10.0_dp
        1:   65:  iteration     = 0
      258:   66:  do while (maxDifference>tolerance)
      257:   67:     expectedValueFunction = matmul(mValueFunction,mTransition);
     1542:   68:     do nProductivity = 1,nGridProductivity
        -:   69:        ! We start from previous choice (monotonicity of policy function)
     1285:   70:        gridCapitalNextPeriod = 1
 22900242:   71:        do nCapital = 1,nGridCapital
 22898700:   72:           valueHighSoFar = -100000.0_dp
 60486291:   73:           do nCapitalNextPeriod = gridCapitalNextPeriod,nGridCapital
 60486291:   74:              consumption = mOutput(nCapital,nProductivity)-vGridCapital(nCapitalNextPeriod) !88
 60486291:   75:              valueProvisional = (1.0_dp-bbeta)*log(consumption)+bbeta*expectedValueFunction(nCapitalNextPeriod,nProductivity) !89
60486291*:   76:              if (valueProvisional>valueHighSoFar) then ! we break when we have achieved the max !90
 37587591:   77:                 valueHighSoFar        = valueProvisional
 37587591:   78:                 capitalChoice         = vGridCapital(nCapitalNextPeriod)
 37587591:   79:                 gridCapitalNextPeriod = nCapitalNextPeriod
        -:   80:              else
 22898700:   81:                 exit
        -:   82:              endif
        -:   83:           enddo
 22898700:   84:           mValueFunctionNew(nCapital,nProductivity) = valueHighSoFar
 22899985:   85:           mPolicyFunction(nCapital,nProductivity)   = capitalChoice
        -:   86:        enddo
        -:   87:     enddo
22900242*:   88:     maxDifference = maxval((abs(mValueFunctionNew-mValueFunction)))
      257:   89:     mValueFunction = mValueFunctionNew
      257:   90:     iteration = iteration+1
      257:   91:     if (mod(iteration,10)==0 .OR. iteration==1) then
       26:   92:        print *, 'Iteration:', iteration, 'Sup Diff:', MaxDifference
        -:   93:     endif
        -:   94:  enddo
        -:   95:  !----------------------------------------------------------------
        -:   96:  ! 5. PRINT RESULTS
        -:   97:  !----------------------------------------------------------------
        1:   98:  print *, 'Iteration:', iteration, 'Sup Diff:', MaxDifference
        1:   99:  print *, ' '
        1:  100:  print *, 'My check:', mPolicyFunction(1000,3)
        1:  101:  print *, ' '
        1:  102:  call CPU_TIME (time_end)
        1:  103:  print *, 'Elapsed time is ', time_end - time_begin
        1:  104:end program RBC_F90

I do find it a bit depressing that the apparently respected author of a textbook, for no obvious reason except perhaps ignorance, doesn’t use standard Fortran.

If the author had called it “Introduction to Computational Economics using a language similar to Fortran but non-standard”, that might have been acceptable, but otherwise it is not. Don’t textbooks go through any form of review process any more?