Why is my code compiled with GFortran on Windows slower than on Ubuntu?

I really appreciate your effort, @JohnCampbell! Thank you so much!

About do concurrent, I agree.
I use it in the hope that the compiler can do some parallelization automatically, and perhaps make things work on a GPU. But Intel’s compiler seems to have some issues with it; here is a post about the issue, and you also replied there :slight_smile:

Personally, I have not found much performance advantage in do concurrent, other than that it can perhaps make the code look more concise.
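For context, here is a minimal sketch of the two loop forms side by side. The module and subroutine names (`dc_sketch`, `axpy_do`, `axpy_dc`) are made up for illustration and are not from my code. As far as I understand it, do concurrent only tells the compiler that the iterations are independent; whether it then parallelizes or offloads anything depends on the compiler and its options.

```fortran
module dc_sketch
  implicit none
contains

  ! Ordinary do loop: y = y + a*x, element by element.
  subroutine axpy_do(a, x, y)
    real, intent(in)    :: a, x(:)
    real, intent(inout) :: y(:)
    integer :: i
    do i = 1, size(x)
      y(i) = y(i) + a * x(i)
    end do
  end subroutine axpy_do

  ! Same update written with do concurrent. The construct asserts that
  ! no iteration depends on another, so any execution order is allowed.
  subroutine axpy_dc(a, x, y)
    real, intent(in)    :: a, x(:)
    real, intent(inout) :: y(:)
    integer :: i
    do concurrent (i = 1:size(x))
      y(i) = y(i) + a * x(i)
    end do
  end subroutine axpy_dc

end module dc_sketch
```

Both subroutines should give the same result for the same input; the only difference is what the compiler is allowed to assume about the loop.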

Thank you for being so careful :+1: :100: Yes, “mgauss_ik” is just there to dynamically adjust the number of samples (for the given i,k) used for the Monte Carlo integral, like below,


where n_ik is actually line 136 in samplers.f90,

“mgauss_ik” is actually not very useful; you can just comment out lines 221 to 224 in samplers.f90 as below,

and just do

mgauss_ik = mgauss

so mgauss_ik will always be a constant equal to mgauss. Then for each n_ik the number of Monte Carlo samples is the same, namely mgauss, which is typically 1000.
The reason for “mgauss_ik” is this: say k=2, so a mixture of 2 Gaussians. The total number of samples for n_i1 and n_i2 is a fixed number, k*mgauss; if mgauss=1000 and k=2, then k*mgauss=2000. However, n_i1 may need more samples than n_i2, so I may put 1500 samples on n_i1 and 500 on n_i2, i.e. “mgauss_i1=1500”, “mgauss_i2=500”, etc. In this way, the total of 2000 samples is distributed more efficiently over n_i1 and n_i2, instead of just giving 1000 samples to each.
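To make the 1500/500 split concrete, here is a minimal sketch of one way to divide a fixed budget of k*mgauss samples. It is not the rule used in samplers.f90: the module name `sample_budget`, the function `allocate_samples`, and the weights `w` (some measure of how much each component needs) are all assumptions made up for illustration.

```fortran
module sample_budget
  implicit none
contains

  ! Split a fixed budget of size(w)*mgauss samples across the components
  ! in proportion to the weights w.
  ! Assumption: every w is non-negative and sum(w) > 0.
  function allocate_samples(mgauss, w) result(m_ik)
    integer, intent(in) :: mgauss     ! baseline samples per component
    real,    intent(in) :: w(:)       ! hypothetical per-component weights
    integer :: m_ik(size(w))
    integer :: total, leftover, j

    total = size(w) * mgauss          ! e.g. 2 * 1000 = 2000

    ! Truncate each proportional share to an integer.
    m_ik = int(real(total) * w / sum(w))

    ! Give the few samples lost to truncation to the component with the
    ! largest weight, so the counts add up to the full budget.
    leftover = total - sum(m_ik)
    j = maxloc(w, dim=1)
    m_ik(j) = m_ik(j) + leftover
  end function allocate_samples

end module sample_budget
```

With this sketch, `allocate_samples(1000, [3.0, 1.0])` returns 1500 and 500, and equal weights return 1000 and 1000, which is the same as setting mgauss_ik = mgauss.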
No worries. In short, “mgauss_ik” does not really influence the code and does not depend much on the seed. You know, if the result of a Monte Carlo simulation depends heavily on the random number seed, then something must be wrong :rofl:

By the way, how did you get the profile information below?

#### Delta_Sec Summary ####   12

 Id Description                      Elapsed    Calls
  1 _START                            0.0000        1
  2 # pYq_i_detail                    3.5952    10201
  3 INITIALISED Yji                   0.0006        1
  4 prep > gauss_thetas               0.0961      102
  5 prep > MC_gauss_ptheta_w_sig      0.0000      102
  6 Metroplis_gik_k_more_o_log        0.4669       50
  7 CC Metroplis_gik_all_o_log        0.0679       50
  8 CC mgauss_ik(i,k)                 0.0006       50
  9 steptest report                   0.0095       50
 10 cpu_time report                   0.0011       50
 11 ANALYSED                          0.0026        1
 12 _FINISHED                         4.2406    10659
  calls to pYq_i_detail =             10198404
 Program end normally.

I tried gprof on Windows, but it always generates an empty profile file; perhaps I will open a new topic to ask about this.

Again, thank you so much! :+1: :100: :slight_smile: