I am trying to compile a code with gfortran instead of ifort. The gfortran build runs much slower than the ifort build, and it does not even use all the threads on my machine. I have 40 cores and 80 threads, and ifort uses all of them. I think it has to do with the ifort flag /Qm64, but I can’t be sure (using the /Qm32 flag with ifort is slower and also does not use all the OpenMP threads, and I am not sure why).
This is how Visual Studio seems to be compiling my code:
The first iteration of the loop runs in 19.9 seconds at 100% CPU usage with ifort, and in 38.8 seconds (almost double) at 50% CPU usage with gfortran.
If I compile with ifort using the /Qm32 flag, I get performance similar to gfortran.
In my experience, ifort and gfortran usually perform similarly on Linux. ifort is typically 10-20% faster, so the difference is not large. On Windows, however, gfortran with the same flags can be up to 7 times slower than it is on Linux, while ifort’s performance is consistent across Windows and Linux.
You may begin with the following flags.
Both can be used with perhaps one single flag,
I am not a gfortran expert, so other people’s opinions may be more useful.
But it looks like you are using OpenMP. The speed difference seems to come simply from the fact that, as you said, gfortran uses only half as many threads as ifort does.
If you explicitly specify the number of threads in OpenMP, does that let the gfortran build make full use of all the threads? Again, I am no expert in gfortran, and other people may give you a much better answer and solve the puzzle.
By the way, the -m64 flag may not be necessary with gfortran. In many cases -O3 -march=native is enough to begin with.
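To make the suggestion about setting the thread count explicitly concrete, here is a minimal sketch. The program name `check_threads` and the variable names are made up for illustration, and the compile commands in the comments are only the usual OpenMP switches, not the flags from the original project. It asks for one thread per logical processor the runtime can see, then reports how many threads the parallel region actually got, so the same source can be built with both compilers and the numbers compared.

```fortran
! Hypothetical test program, not taken from the original code.
! Typical builds (assumed, adjust to your setup):
!   gfortran -fopenmp check_threads.f90
!   ifort /Qopenmp check_threads.f90      (Windows)
program check_threads
   use omp_lib
   implicit none
   integer :: nprocs, nused

   ! Number of logical processors the OpenMP runtime can see.
   nprocs = omp_get_num_procs()

   ! Explicitly request one thread per visible logical processor.
   call omp_set_num_threads(nprocs)

   nused = 0
   !$omp parallel
   !$omp single
   ! Only one thread writes; the team size is the same for all threads.
   nused = omp_get_num_threads()
   !$omp end single
   !$omp end parallel

   print '(a,i0)', 'logical processors seen by the runtime: ', nprocs
   print '(a,i0)', 'threads used in the parallel region:    ', nused
end program check_threads
```

If the gfortran build reports fewer logical processors than the ifort build on the same machine, the problem is likely in what the runtime sees rather than in the optimization flags. Setting the `OMP_NUM_THREADS` environment variable before running is another way to try the same thing without recompiling.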
May I ask what you mean by a performance boost from Intel MKL? Do you mean that calling the functions MKL provides gives the boost?
The only reason I have not dug much into MKL is that I feel if I use a lot of MKL-exclusive functions and subroutines, my code will probably become Intel Fortran exclusive. It would then no longer compile with both gfortran and ifort, and so may not be generic enough.
Or do you mean that the same LAPACK subroutines, such as dgesv or dgeev, run faster when linked against MKL than when linked against another LAPACK library? Just curious. Again, I am not an expert in either ifort or gfortran. Thanks!
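For reference, a call to a standard LAPACK routine looks the same in the source regardless of which library provides it, since MKL exports the standard LAPACK names. The sketch below is only an illustration: the program name `solve_demo` and the 2x2 system are invented, and the link commands in the comments are typical examples that depend on the installation. Whether MKL is actually faster for a given problem has to be measured.

```fortran
! Hypothetical example: the same source can be linked against
! reference LAPACK or against MKL, only the link step changes.
! Typical builds (assumed, depend on your installation):
!   gfortran solve_demo.f90 -llapack -lblas
!   ifort /Qmkl solve_demo.f90            (Windows, links MKL)
program solve_demo
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: n = 2
   real(dp) :: a(n,n), b(n,1)
   integer :: ipiv(n), info
   external :: dgesv

   ! Small illustrative system A x = b, stored column by column.
   a = reshape([2.0_dp, 1.0_dp, 1.0_dp, 3.0_dp], [n, n])
   b(:,1) = [3.0_dp, 5.0_dp]

   ! Standard LAPACK interface: DGESV(N, NRHS, A, LDA, IPIV, B, LDB, INFO).
   ! On exit, b holds the solution and a holds the LU factors.
   call dgesv(n, 1, a, n, ipiv, b, n, info)

   if (info /= 0) then
      print '(a,i0)', 'dgesv failed, info = ', info
   else
      print '(a,2f10.5)', 'solution: ', b(:,1)
   end if
end program solve_demo
```

Sticking to the standard LAPACK and BLAS names like this keeps the code portable between compilers; only routines that exist in MKL alone would tie the code to it.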