Thanks to @dwwork and @themos for their lucid explanations. I have had the same questions as the OP for some time. I found a post, "What scientists must know about hardware to write fast code", by BioJulia developer Jakob Nybo Nissen:
Inside RAM, data is kept on either the stack or the heap. The stack is a simple data structure with a beginning and end, similar to a `Vector` in Julia. The stack can only be modified by adding or subtracting elements from the end, analogous to a `Vector` with only the two mutating operations `push!` and `pop!`. These operations on the stack are very fast. When we talk about “allocations”, however, we talk about data on the heap. Unlike the stack, the heap has an unlimited size (well, it has the size of your computer’s RAM), and can be modified arbitrarily, deleting any objects.
Intuitively, it may seem obvious that all objects need to be placed in RAM, must be able to be retrieved and deleted at any time by the program, and therefore need to be allocated on the heap. And for some languages, like Python, this is true. However, this is not true in Julia and other efficient, compiled languages. Integers, for example, can often be placed on the stack.
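To make the stack analogy concrete for myself, here is a minimal sketch that treats a `Vector` as a stack by only ever touching its end. The variable names are my own, and this only illustrates the last-in, first-out behaviour, not how the hardware stack is actually implemented.

```julia
# A Vector used as a stack: elements are only added to or removed from the end.
stack = Int[]

push!(stack, 1)     # stack is now [1]
push!(stack, 2)     # stack is now [1, 2]
push!(stack, 3)     # stack is now [1, 2, 3]

top = pop!(stack)   # removes and returns the last element pushed (3)

println("popped: ", top)
println("remaining: ", stack)
```

The element pushed last is the first one to come back out, which is the only access pattern the stack allows.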
Why do some objects need to be heap allocated, while others can be stack allocated? To be stack-allocated, the compiler needs to know for certain that:
- The object is a reasonably small size, so it fits on the stack. This is needed for technical reasons for the stack to operate.
- The compiler can predict exactly when it needs to add and destroy the object so it can destroy it by simply popping the stack (similar to calling `pop!` on a `Vector`). This is usually the case for local variables in compiled languages.
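The two conditions above can be checked roughly with `@allocated`, which reports the bytes of heap memory allocated while evaluating an expression. This is only a sketch: the function names `sum_tuple` and `sum_vector` are made up for illustration, and the exact numbers depend on the Julia version and on compiler optimizations, so I have not written down expected values.

```julia
# Hypothetical helper functions, for illustration only.

# A small tuple of integers has a fixed size known to the compiler and does
# not outlive the function, so it usually needs no heap allocation.
sum_tuple() = sum((1, 2, 3))

# A Vector is mutable and its length is not part of its type, so creating one
# normally allocates on the heap.
sum_vector() = sum([1, 2, 3])

# Call each function once first so compilation is not counted in the measurement.
sum_tuple()
sum_vector()

tuple_bytes = @allocated sum_tuple()
vector_bytes = @allocated sum_vector()

println("tuple version allocated:  ", tuple_bytes, " bytes")
println("vector version allocated: ", vector_bytes, " bytes")
```

On the Julia versions I am aware of, the tuple version typically reports zero bytes while the vector version reports a nonzero number, which matches the explanation quoted above.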