Why use DO CONCURRENT ?
What does it imply ? (I am not sure)
Potentially (depending on the compiler) it could initialise some MPI interfaces, which would be totally unnecessary. All you need is DO.
If you put it in the code, then the next person to maintain the code will have to answer these questions.
Your code sample also identifies “mgauss_ik”, which I presume modifies the loop count depending on the data set (of different random numbers). This could change the calculation extent, based on the use of RANDOM_NUMBER which differs between compilers.
I looked at function pYq_i_detail (90% of computation in my profiling), as it did not appear to respond to -ffast-math or -O. I suspect it does not utilise AVX, so I tried to introduce array instructions to see if it worked any better. It did not
Handling of IEEE exceptions is very compiler dependent. I have found “equation.com”:gfortran to be slow for this. These are more often to occur in performance tests, rather than real data. I wasted months on this issue until mecej4 identified this problem for me. You need sufficiently realistic data sets for testing to avoid these unwanted side issues.
I introduced some profiling into the code, and produced the following:
#### Delta_Sec Summary #### 12
Id Description Elapsed Calls
1 _START 0.0000 1
2 # pYq_i_detail 3.5952 10201
3 INITIALISED Yji 0.0006 1
4 prep > gauss_thetas 0.0961 102
5 prep > MC_gauss_ptheta_w_sig 0.0000 102
6 Metroplis_gik_k_more_o_log 0.4669 50
7 CC Metroplis_gik_all_o_log 0.0679 50
8 CC mgauss_ik(i,k) 0.0006 50
9 steptest report 0.0095 50
10 cpu_time report 0.0011 50
11 ANALYSED 0.0026 1
12 _FINISHED 4.2406 10659
calls to pYq_i_detail = 10198404
Program end normally.
note: '# pYq_i_detail is reported at exit from subroutine MC_gauss_ptheta_w_sig; called from subroutine prep, where m = mgauss_ik(i,k). This count is significant for the comparison.
(Times are for i5-2300)