Stack and Heap

Memory Allocations
Stack
Heap
Performance Differences
Arrays
Which one do I use?

Memory Allocations

As you know, memory management in C++ is a manual processing. The programmer is responsible for allocating and de-allocating all memory that is used within the program. However, when allocating memory, we tend to allocate it from one of two different regions of memory that is assigned to our program.

Stack

The stack is a 1MB block of memory that is allocated for each program on launch¹. Objects and memory that are allocated on the stack have the lifetime of the current scope. This means that objects allocated on the stack are automatically destructed and de-allocated when the scope ends.

Stack Frame

Whenever a new scope is begun, the current stack address is stored and a new stack frame begins. When the scope exists, the current stack address is restored to the address cached at the start of the scope. When a stack frame is destroyed, all objects that are part of the stack frame their destructor is automatically invoked. The memory is then de-allocated for future use²

// New stack frame is pushed on new scope
{
	// Allocate an object on the stack
	MyObject a( myArgs );
}
// MyObject::~MyObject is invoked on `a` and stack frame is cleaned up
// `a` no longer exists at this point

There are many different parts of a program that have their own scope, and as a result their own stack frame:

Function Body
Anonymous Scope
Function Arguments
Loops expression
Loops body
…

void MyFunction( int a )
{ // Scope 0 begin
	for( int i = 0; i < a; ++i ) // Scope 1 begin
	{ // Scope 2 begin - each iteration
		std::cout << i << std::endl;
	} // Scope 2 end - each iteration
	// Scope 1 end
} // Scope 0 end

// The literal 5 is allocated on the stack to be passed to the function.
// The scope is pushed just before the function is invoked and destroyed
// after the function returned.
MyFunction( 5 );

Stack Overflow

When the program attempts to allocate too much memory from the stack, it is said that a stack overflow³ occurs. Examples of scenarios where this can trigger is when allocating too many objects onto the stack (e.g. a local array) or for instance when a recursive function is invoked too many times (e.g. the function fails to have an end condition).

Object Size

When allocating objects on the stack, their size has to be known at compile time. There are ways to dynamically allocate stack memory through the use of the alloca function, but lots of care has to be taken as mis-using this memory can have severe consequences, with crashing being the best case.

More details on this in the Array section.

Heap

When the Stack is a “fixed-size” block of memory assigned to the program, the Heap is effectively the remainder of all other CPU-accessible memory in the system. The maximum amount of memory that can be allocated from the heap is limited by a wide range of factors, such as architecture (32-bit vs 64-bit) and system memory avaible (“RAM”). However, given that most systems we target for game development tend to have 8GB of system memory (excluding mobile and very low spec machines), this means that the heap has got ~8GB of memory available. However, the heap memory is shared between all processes within a system and is not just assigned to your program.⁴

All memory that is allocated from the heap is always returned as a pointer and is not automatically managed by the runtime. It is the program’s responsibility to free⁵ all of the memory that has been allocated from the heap.

// Allocate an object from the heap
MyObject* a = new MyObject( myArgs );

...

// Destroy the object and free the memory
delete a;

Memory Leaks

When the program allocates memory from the heap, it also has to free it. If the program does not free the memory, it is said that a memory leak has occured.

Memory leaks are often misunderstood. Technically speaking, when a program exists, you have to free all of the memory. However, the operating system will free the memory that is used by the program on exit, meaning that even though your program technically has a memory leak, in practice it won’t actually make any difference. It is always considered good practice to clean up all of your memory before exit, as this guarantees cleaner code paths and helps keep an easier to follow flow of the program, as well as the code being less error-prone on future refactors.

What we actually mean when we talk about memory leaks, is a program that continues to use more and more memory during execution without freeing the memory. Common cases for this are constant object allocation without freeing them after use, or creating lists that keep on having more and more elements added to them without ever being trimmed. These patterns will cause the program’s memory footprint to increase over time as the program is running, until eventually the system potentially runs out of memory.

// Function is invoked once or more each frame
void ExecuteEachFrame()
{
	MyObject* newObject = new MyObject( myArgs );
	DoSomething( newObject );
	// We never delete the object, meaning that we allocate memory each frame
	// that is never recycled. This is a memory leak. We should be
	// calling `delete newObject` here.
}

A better solution for the example above is as follows:

// Function is invoked once or more each frame
void ExecuteEachFrame()
{
	// Allocate the object on the stack and pass the
	// address of the object to the function
	MyObject newObject( myArgs );
	DoSomething( &newObject );
	// `newObject` is automatically destroyed as the
	// scope of the function exists, freeing the memory
}

`new`/`delete` vs `malloc`/`free`

There is a difference between calling new and malloc when allocating memory from the heap. The new operator is the C++ version of allocating memory, while malloc is the C function for allocating memory.

The new operator first allocates a block of memory, where the size and alignment of the memory block is determined by the class for which we are invoking the new operator of and then invokes the matching constructor of the class and passes the arguments to the constructor. Similarly, the delete operator first invokes the destructor of the object and then frees the memory that was allocated for said object.

MyObject* a = new MyObject( myArgs );
...
delete a;

When using malloc, we just allocate a block of memory that isn’t associated with any object type or alignment. As a result, malloc always returns a void*, which points to the address in the heap of the memory block allocated. The same idea holds for free, which simply de-allocates the memory without invoking the destructor.

void* memoryBlock = malloc( sizeof( MyObject ) );
MyObject* a = reinterpret_cast< MyObject* >( memoryBlock );
a->MyObject( myArgs );
...
a->~MyObject();
free( a ); // MyObject* is implicitly cast to void*

In practice, there is no difference between the two code examples. The new and delete operator are effectively just syntactic-sugar for the code in the second example. However, new often is safer to use because it’s easier to get allignment rules wrong when using the malloc/free approach of allocating and de-allocating memory.

Because of the fact that the new and delete operator are just syntactic sugar, you can mix the usage of the two methods of allocating and freeing memory, for instance allocate using malloc approach and free using delete approach. This is often done when dealing with custom memory allocators, which is a common practice in games due to the performance cost associated with memory allocations⁶.

Placement `new`

Sometimes we want to use the newer new operator but want to invoke it on a block of memory that we have pre-allocated. This can be done with the code shown in the malloc example, but this code is quite verbose. As a result, an overload for the new operator was introduced that allows us to specify the memory location of where we want to construct the object. This is called placement new.

void* memoryBlock = malloc( sizeof( MyObject ) );
MyObject* a = new ( memoryBlock ) MyObject( myArgs );

The pointer to the memory location where we want to specify is constructed as an extra argument between paranthesis before the object constructor. Using this syntax allows the code to be more consice and is considered the more modern approach to allocating in-place.

Performance Differences

At the core of it, there is no immediate performance difference between stack and heap memory. However, stack and heap memory have got different memory allocation patterns and guarantees which can result in significant performance differences if the programmer isn’t careful.

Memory performance is a complicated beast and accessing memory (both read and write) can have very different latency depending on the access patterns. Explaining why and how is outside of the scope of this document. The main thing to remember is that accessing different locations in memory that are close together in memory significantly faster⁷ than when accessing memory from random locations.

When we allocate memory from the stack, because it is all allocated from the same 1MB block, the memory tends to be heavily localised. Allocating two objects in a row are by design guaranteed to be close together in memory. As a result, accessing memory that is allocated on the stack is almost always very fast.

In contrast to this, when we allocate memory from the heap, the memory can come from pretty much anywhere within the system memory. When we allocate two objects in a row, the operating system provides zero guarantees that the objects will be close together in memory, and more often than not they won’t be. Accessing objects that are allocated from the heap at different locations is often called pointer jumping, as we jump all over memory when accessing them.

Arrays

As objects allocating on the stack their size have to be known at compile time, this means that we can only allocate arrays on the stack if the size is a compile time constant.

int a[ 500 ]; 			// compiles ok
int count = 1 + (rand() % 499); // random value between 1 and 500
int b[ count ]; 		// error C2131: expression did not evaluate to a constant

If you want to allocate an array of which the size is determined at runtime, you have to use the heap to allocate the array. There are special cases of the new and delete operator which are specifically designed for arrays, which are the new[] and delete[] operator. As arrays are just a pointer to the first element of the array, the syntax for arrays allocated on the heap are just a pointer.

int count = 1 + (rand() % 499); // random value between 1 and 500
int* b = new int[ count ]; 	// compiles ok
...
delete[] b;

As an array is just a pointer, there is nothing stopping you from invoking delete instead of delete[] on a heap allocated array. However, this can lead to undefined behaviour, so care should be taken to use the correct operator.

Which one do I use?

So with these different types of memory and allocation methods, which one do I use in which case?

In general, when an object has a short lifetime, such as only needs to be alive for the duration of the function scope, it is better to use the stack. If an object needs to exist for longer than the scope in which it was created, such as across frames, we need to use the heap. However, even if the lifetime of an object is withint a single function, if the object will be quite large in size (e.g. an array holding the data of an entire file read from disk), we want to use the heap as we only have 1MB of stack memory available and running out will cause a crash.

The stack is actually local per thread and it’s size is specified at thread creation time. However, each program starts with a single thread with a default of 1MB of stack space. ↩
Technically speaking the memory isn’t de-allocated. Because the stack address pointer is reset, the memory is mearly marked for re-use. ↩
Yes, this is where the name of the famous programming message board Stack Overflow comes from. ↩
On most systems, memory can be over-committed and doesn’t always have to be backed by physical memory, meaning that multiple programs have access to a large range of memory. ↩
Free is the term used when de-allocating memory. It comes from the standard C-function free. ↩
Allocating memory from the heap is considered very slow and often one of the biggest bottlenecks of a frame in a game. ↩
When we are saying significantly slower, we are talking order of magnitudes of 100-200 times slower or even slower. The actual numbers are heavily hardware dependent. ↩

Last modified on Tuesday 01 February 2022 at 17:38:02