I am wondering whether somebody has experienced a similar issue:
this code
program test
   use iso_fortran_env
   implicit none
   integer(int64) :: n = 100000000, i
   integer(int64), allocatable :: a(:)
   a = (/ (i, i = 1, n) /)
   write(*,*) "done"
end program test
when compiled with ifort version 2021.2.0, either on Arch Linux, kernel version 6.0.6 (host system), or Ubuntu 22.04.1, kernel version 5.15.0-48 (VirtualBox guest system), with the host having an Intel(R) Core™ i9-10980XE CPU … the code runs on both machines.
When the executable (NOT THE CODE) was copied to a machine running Ubuntu 18.04.1, kernel version 5.4.0-89, with an AMD EPYC 7742 64-Core Processor, it segfaults immediately.
I could reproduce the problem in a VirtualBox Ubuntu 18.04 guest running on the same host as above: code compiled under Arch or Ubuntu 22.04 segfaults immediately when run on 18.04.
Could it be that memory limits are set differently on the different systems you are running on? Here’s an article which shows how you can use ulimit to check this: Restricting program memory | There is no magic here
Same problem on Linux Mint 20.3 with an AMD Zen 3 processor and ifort 2021.7. However, setting the stack size to unlimited prior to execution fixes the problem, i.e.
ulimit -s unlimited
Not sure why this affects the Ubuntu-based distros. I thought allocatable arrays were always put on the heap, not the stack.
Note that NVIDIA Fortran doesn't appear to have this problem, so it looks like an Intel ifort issue. It probably sees the AMD processor and decides not to apply some optimization it normally would for an Intel processor.
However, I just figured out that one of our clients had deleted the delivered .bashrc … and checking the bashrc was the last thing I was thinking about … and of course, when I installed an ad hoc Ubuntu 18.04 guest, I forgot to adjust the bashrc … (headbang) …
After making the above changes, everything seems to run.
Thanks for pointing me to the trees while I was looking at the forest.
Yes, I thought it was due to a temp array after my last post. What is interesting is that neither ifort nor ifx is able to optimize the temp array out. The latest NVIDIA nvfortran doesn't have this problem, so I'm guessing it either doesn't use a temp array or can optimize it out.
A compiler can do this in many different ways. One way would be to create the RHS array at compile time and then assign it to the allocatable array a(:) at run time; if your compiler does this, the executable will be about 800 MB in size (10^8 elements × 8 bytes each). Another way would be to create the RHS at run time, in which case the executable will be just a few KB. If it creates the RHS on the stack, then it could run into the stack limit, as others have discussed already. Another way would be to compile the code as if it were written
allocate(a(n))
do i = 1, n
   a(i) = i
end do
In this case, there would be a small executable and no stack size issues. Yet another way would be to optimize away both the RHS array and the assignment as dead code, since a(:) is never referenced; that would likewise give a small executable and no stack size issues.
In an actual program, I would write it as above, with the allocate statement and the do loop. That way the success of the code would not depend on how clever the compiler optimization is.
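For reference, here is the complete test program restructured that way; nothing else about the original example is changed:

program test
   use iso_fortran_env
   implicit none
   integer(int64) :: n = 100000000, i
   integer(int64), allocatable :: a(:)
   ! Allocate explicitly and fill in place, so there is no
   ! array-constructor temporary for the compiler to put on the stack.
   allocate(a(n))
   do i = 1, n
      a(i) = i
   end do
   write(*,*) "done"
end program test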
For reference, ifort's -heap-arrays [size] option puts automatic arrays and arrays created for temporary computations on the heap instead of the stack. When this option is specified, automatic (temporary) arrays that have a compile-time size greater than the value specified for size are put on the heap, rather than on the stack. If the compiler cannot determine the size at compile time, it always puts the automatic array on the heap.
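As a sketch of where that option matters, consider a subroutine with an automatic array (a hypothetical example, not code from this thread):

! 'work' is an automatic array; by default it typically lives on the
! stack. Its size is not known at compile time, so with -heap-arrays
! it is always placed on the heap instead.
subroutine scale_copy(n, x, y)
   use iso_fortran_env, only: int64, real64
   implicit none
   integer(int64), intent(in) :: n
   real(real64), intent(in)   :: x(n)
   real(real64), intent(out)  :: y(n)
   real(real64) :: work(n)
   work = 2.0_real64 * x
   y = work
end subroutine scale_copy

Compiling with, e.g., ifort -heap-arrays 0 file.f90 would move such arrays to the heap regardless of their size.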