Does the CPU support hyper-threading? I assume it only has 2 NUMA domains (1 for each socket)?
I think the general answer is no: if you don’t specify a binding, the behavior is implementation-defined. You can check what happens in practice by setting OMP_DISPLAY_AFFINITY=1.
For instance, the libgomp docs say that threads can be moved between CPUs:
If OMP_PLACES and GOMP_CPU_AFFINITY are unset and OMP_PROC_BIND is either unset or false, threads may be moved between CPUs following no placement policy.
Moreover, the libgomp docs state:
When undefined, OMP_PROC_BIND defaults to TRUE when OMP_PLACES or GOMP_CPU_AFFINITY is set and FALSE otherwise.
So setting OMP_PLACES=sockets will automatically bind/pin the OpenMP threads when using libgomp (but not necessarily in libiomp).
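To make the quoted rule concrete, here is a small Python sketch that models the documented libgomp default. It is only my reading of the two doc excerpts above, not libgomp's actual code, and the function name is made up for illustration. It also treats a variable as "set" whenever it is present in the environment, which is an assumption.

```python
def libgomp_default_proc_bind(env):
    """Model the documented libgomp default for OMP_PROC_BIND.

    `env` is a dict of environment variables (hypothetical helper, not a
    libgomp API). An explicit OMP_PROC_BIND wins; otherwise the default is
    TRUE if OMP_PLACES or GOMP_CPU_AFFINITY is set, and FALSE if neither is.
    """
    explicit = env.get("OMP_PROC_BIND")
    if explicit is not None:
        # The user specified a value, so no default applies.
        return explicit.upper()
    if "OMP_PLACES" in env or "GOMP_CPU_AFFINITY" in env:
        return "TRUE"
    return "FALSE"


# Setting only OMP_PLACES implies binding under this rule.
print(libgomp_default_proc_bind({"OMP_PLACES": "sockets"}))
# With nothing set, threads are free to migrate.
print(libgomp_default_proc_bind({}))
```

Again, this only mirrors the libgomp wording; libiomp has its own defaults, as described next.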
For libiomp, the thread affinity interface is documented here. The OpenMP variables are documented here.
By default, thread binding is set to FALSE, which is equivalent to KMP_AFFINITY=none, which means:
Does not bind OpenMP threads to particular thread contexts; however, if the operating system supports affinity, the compiler still uses the OpenMP thread affinity interface to determine machine topology.
I don’t understand the implications of the second part of the sentence.
You need to be especially careful on HPC systems, because sometimes the variables are already set by the admins or by modules. I’ve encountered an environment where KMP_ variables were set and took precedence over the OMP_ variables. To use the OpenMP interface, I had to explicitly unset the KMP_AFFINITY variable. If the variables are not set correctly, it’s easy to slow everything down by mixing up MPI ranks and OpenMP threads in ways that don’t respect the machine topology or that clash at the software level.
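As a rough sketch of that workaround, the Python helper below builds a cleaned-up environment for launching an OpenMP program: it drops KMP_AFFINITY, sets the standard OMP_ variables, and turns on the affinity report. The helper name and the default values ("sockets", "true") are my own choices for illustration, not something prescribed by either runtime. I spell the display value as TRUE, which is how the OpenMP specification writes it.

```python
def openmp_env(base_env, places="sockets", proc_bind="true"):
    """Return a copy of `base_env` prepared for the standard OpenMP interface.

    Hypothetical helper: removes KMP_AFFINITY so it cannot override the
    OMP_ variables, then sets the placement, binding, and display variables.
    """
    env = dict(base_env)
    # Equivalent to `unset KMP_AFFINITY` in the shell.
    env.pop("KMP_AFFINITY", None)
    env["OMP_PLACES"] = places
    env["OMP_PROC_BIND"] = proc_bind
    # Ask the runtime to report where each thread ends up.
    env["OMP_DISPLAY_AFFINITY"] = "TRUE"
    return env


# Example with a site-provided setting that would otherwise take precedence.
cleaned = openmp_env({"KMP_AFFINITY": "compact", "PATH": "/usr/bin"})
print(sorted(cleaned))
```

You would typically call it as openmp_env(os.environ) and pass the result as the env= argument of subprocess.run when starting your executable, then compare the printed affinity report against what you expected.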
Probably not relevant to your question, but on macOS I’m not sure OpenMP thread affinity is supported at all. The OS kernel schedules threads as it deems best, and the affinity variables are ignored. At least that’s what I’ve inferred from the following threads:
- macos - How to set processor affinity on OS X? - Super User
- parallel processing - How can I get an OpenMP program to use multiple cores on macOS - Stack Overflow
- Thread affinitization: pinning Julia threads to cores - General Usage - Julia Programming Language
- [Apple M1] Force an execution on a specif… - Apple Community
A while ago I did some experiments with Xcode Instruments (the native Mac profiler), and you can see the thread switching that takes place.