Optimization in gfortran vs pgf95

I’m running some stellar atmosphere code on a Mac. When the code is compiled with the pgf95 compiler using the flags
-mp -Mlarge_arrays -O2 -fast -Mconcur
it takes about 3 hours to run.

When I compile the same code using gfortran and the switches
-O2 -fopenmp -m64
it takes about 5 hours to run.

In both cases I’m running nominally on 4 cores, although judging from what I see in “top”, the gfortran build may not be fully utilizing them.

I’d appreciate any suggestions. The “-fast” in pgf95 seems to do a lot of good things, but I can’t seem to identify anything analogous in gfortran. Thanks for any advice.


gfortran has an -Ofast option (note the capital O):

Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math, -fallow-store-data-races and the Fortran-specific -fstack-arrays, unless -fmax-stack-var-size is specified, and -fno-protect-parens. It turns off -fsemantic-interposition.

Edit: it looks like the two are not equivalent, though. -fast in pgf95 may be mostly about enabling SIMD vectorisation for the host CPU, which I think would correspond to adding -march=native in gfortran.
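
To make the suggestion concrete, here is a rough sketch of a gfortran build line that combines the flags mentioned above. The file names atmosphere.f90 and atmosphere are placeholders, not taken from the original post. Whether -Ofast is safe depends on the code, since the -ffast-math it turns on relaxes IEEE floating-point behaviour, so it would be prudent to compare results against the -O2 build. The snippet only prints the command so it can be inspected before running it.

```shell
# Placeholder names (assumptions, not from the original post).
SRC="atmosphere.f90"
EXE="atmosphere"

# -Ofast        : -O3 plus optimizations that are not strictly standard-compliant
#                 (turns on -ffast-math, among others)
# -march=native : allow instructions, including SIMD, supported by the build machine
# -fopenmp -m64 : kept from the original gfortran command line
FFLAGS="-Ofast -march=native -fopenmp -m64"

# Dry run: print the compile command rather than executing it.
BUILD_CMD="gfortran $FFLAGS -o $EXE $SRC"
echo "$BUILD_CMD"
```

Since OpenMP programs read their thread count from the OMP_NUM_THREADS environment variable, setting it explicitly (for example `export OMP_NUM_THREADS=4`) before the run is a cheap way to check whether all four cores are actually being used.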
