Why is a function that returns an array much slower than a subroutine that returns an array? (real MWE included)

By the way, does gfortran have an option like -heap-arrays?
Or does gfortran automatically put arrays on the heap?

It looks like I never need to do anything, and gfortran never gives me a stack overflow.
I usually just use gfortran with -Ofast or -O3 with -march=native, and it works fine.