Hi,
When using OpenMP, if I want to ensure that a given thread stays attached to the same physical core for the whole lifetime of the program, it seems that setting the environment variable OMP_PROC_BIND to true is enough. Is that correct?
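To check from inside the program which binding policy the runtime actually picked up, I would try something like the sketch below. The function name proc_bind_name is my own invention. It only reports the policy that applies to the next parallel region; it does not prove that a thread never migrates within its place.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns a printable name for the binding policy that will apply to the
   next parallel region (as reported by omp_get_proc_bind). */
const char *proc_bind_name(void)
{
#ifdef _OPENMP
    switch (omp_get_proc_bind()) {
    case omp_proc_bind_false:  return "false";   /* threads may migrate */
    case omp_proc_bind_true:   return "true";    /* threads are bound to places */
    case omp_proc_bind_close:  return "close";
    case omp_proc_bind_spread: return "spread";
    default:                   return "other";   /* e.g. master/primary */
    }
#else
    return "no OpenMP";                          /* built without OpenMP support */
#endif
}
```

Printing the result once at startup, with and without OMP_PROC_BIND set, should show whether the variable is taken into account.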
Now, I have a machine with 2 CPUs, each with 16 cores. When running a program with 32 threads, do I have to specify something special (possibly with OMP_PLACES?) to ensure that threads 0-15 are attached to the first CPU and threads 16-31 to the second, or is this guaranteed by default?
Another case: on this 32-core machine, I want to use only 16 threads and be sure that they all run on the same CPU. I understand that I should set OMP_PLACES to sockets: is that correct?
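For the placement questions, this is a minimal sketch of how I would inspect where each thread ends up, assuming an OpenMP 4.5 runtime (for omp_get_place_num). The function name record_places and the fixed-size output array are hypothetical choices of mine. How place numbers map to sockets or cores depends on OMP_PLACES and on the implementation.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Stores in place_of[t] the place number thread t is bound to
   (-1 if the thread is unbound or the build has no OpenMP).
   Returns the number of threads in the parallel region. */
int record_places(int *place_of, int capacity)
{
    int nthreads = 1;
    for (int t = 0; t < capacity; ++t)
        place_of[t] = -1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        if (tid < capacity)
            place_of[tid] = omp_get_place_num();  /* -1 when not bound */
        #pragma omp single
        nthreads = omp_get_num_threads();
    }
#endif
    return nthreads;
}
```

With OMP_PLACES set to sockets and OMP_PROC_BIND set to true, I would expect every entry to be a socket index, so 16 threads all reporting the same number would mean they share one CPU.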
And finally, about allocations: on a NUMA system, I assume that memory allocated by a given thread is placed, as far as possible, in the physical RAM that is “attached” to the CPU where the thread is running, and that this happens at the “first touch” and not necessarily when “allocate” is executed (at least on the main current OSes, which do “lazy allocation”). Is that correct?
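To make the "first touch" assumption concrete, here is a minimal C sketch of the pattern I have in mind. The function names are hypothetical, and it assumes the OS really does place each page on the NUMA node of the thread that first writes to it. The array is initialized in parallel with the same static schedule that later loops use, so each thread first touches the part it will work on.

```c
#include <stdlib.h>

/* Allocates n doubles and initializes them in parallel. With lazy
   allocation, malloc typically only reserves addresses; the physical
   pages are assumed to be placed when they are first written. */
double *alloc_first_touch(size_t n)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        a[i] = 1.0;              /* first write to this element's page */
    return a;
}

/* Later work uses the same static schedule, so each thread mostly reads
   the pages it touched first. */
double sum_same_schedule(const double *a, size_t n)
{
    double s = 0.0;
    #pragma omp parallel for schedule(static) reduction(+:s)
    for (long i = 0; i < (long)n; ++i)
        s += a[i];
    return s;
}
```

This only helps if the threads are also bound (otherwise a thread could later run on the other CPU), and small allocations may be served from memory that has already been touched.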