Why is my code compiled with GFortran on Windows slower than on Ubuntu?

I have now tried using the gfortran profiling utility : gprof
This is my first use of gprof, so my option selection may not be the best, but I get a useful table.
It is much easier than my approach.

The batch file I used to test with the profiling report is:

del *.o
del *.mod
del em_mix.exe

set options=-g -fimplicit-none -fallow-argument-mismatch -march=native -pg

gfortran ran.f90 samplers.f90 em_mix.f90 %options% -o em_mix.exe

dir *.exe

em_mix

gprof -b -J -p em_mix.exe > em_mix_profile.log

notepad em_mix_profile.log

notepad em_mix.log

The resulting profile log I recovered is useful

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 56.00      1.82     1.82                             exp
 26.46      2.68     0.86 10198404     0.00     0.00  __samplers_MOD_pyq_i_detail
  3.38      2.79     0.11                             __logl_internal
  3.08      2.89     0.10    10200     0.01     0.09  __samplers_MOD_mc_gauss_ptheta_w_sig
  2.15      2.96     0.07                             __fentry__
  2.15      3.03     0.07                             _mcount_private
  1.54      3.08     0.05  1000000     0.00     0.00  __samplers_MOD_pyq_more_o_log
  1.54      3.13     0.05      100     0.50     1.20  __samplers_MOD_metroplis_gik_k_more_o_log
  0.92      3.16     0.03  6040500     0.00     0.00  __random2_MOD_ran1
  0.62      3.18     0.02                             __cosl_internal
  0.62      3.20     0.02                             log
  0.31      3.21     0.01  4000008     0.00     0.00  __random_MOD_randn
  0.31      3.22     0.01      208     0.05     0.05  __random_MOD_gaussian
  0.31      3.23     0.01                             __sinl_internal
  0.31      3.24     0.01                             cos
  0.31      3.25     0.01                             sin
  0.00      3.25     0.00    10660     0.00     0.00  delta_sec_
  0.00      3.25     0.00      102     0.00     0.10  __samplers_MOD_gauss_thetas
  0.00      3.25     0.00       90     0.00     0.00  __samplers_MOD_corrchk_internal
  0.00      3.25     0.00       51     0.00    19.02  __samplers_MOD_prep
  0.00      3.25     0.00       50     0.00     0.40  __samplers_MOD_metroplis_gik_all_o_log
  0.00      3.25     0.00       50     0.00    21.82  __samplers_MOD_steptest
  0.00      3.25     0.00        9     0.00     0.00  __samplers_MOD_get_musigma
  0.00      3.25     0.00        1     0.00     0.00  __random_MOD_savern
  0.00      3.25     0.00        1     0.00     0.00  __random_MOD_setrn
  0.00      3.25     0.00        1     0.00     0.00  __samplers_MOD_get_datetime
  0.00      3.25     0.00        1     0.00     0.00  __samplers_MOD_get_musigma_maxll
  0.00      3.25     0.00        1     0.00     0.00  __samplers_MOD_push_yji
  0.00      3.25     0.00        1     0.00     0.00  __samplers_MOD_samplers_init

This clearly identifies the exp intrinsic and function pyq_i_detail as the main time usage in “self seconds”
The “cumulative seconds” is 3.25 seconds, which is less than 5.07 seconds I have obtained from SYSTEM_CLOCK, but is hopefully explained in the documentation.
It does not provide call counts for intrinsics exp, log, sin and cos

gprof is a very easy way to identify where time is being spent.
It would be useful to perform this test on the range of gfortran implementations you have available.

2 Likes