C++ Beyond the Syllabus #4: RAII & Smart Pointers
Say goodbye to memory leaks.
Say goodbye to memory leaks… and say hello to your new friends, RAII and smart pointers!
The new and delete keywords are taught in 100-level Intro to CS courses. They are fundamental building blocks of C++ programming, but they are rarely used directly in industry code.
When it comes to real world C++ development, RAII and smart pointers are essential for code cleanliness, readability, and maintainability. Not to mention, they will drastically decrease your chances of memory leaks.
Let’s jump into it.
Raw Pointers Aren’t Cutting It
As a reminder, the new and delete keywords help programmers interact with memory. Without these keywords, objects are (generally) allocated on the stack. When created with new, an object is allocated on the heap. The new operator returns a raw pointer to the created object:
MyObject *object_ptr = new MyObject();
Objects created on the heap do not automatically go out of scope and get destructed like objects on the stack do. You have to manually apply the delete keyword when you are done using the heap-allocated object:
delete object_ptr;
For arrays, you’ll need the bracket operator [] along with the new and delete keywords (that is, new[] and delete[]).
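As a quick illustration of the array forms, here is a minimal sketch. The names SumOfSquares and ArrayDemo are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: fills a heap-allocated array and sums its elements.
int SumOfSquares(std::size_t count) {
  int* values = new int[count];  // array form of new
  for (std::size_t i = 0; i < count; ++i) {
    values[i] = static_cast<int>(i * i);
  }
  int sum = 0;
  for (std::size_t i = 0; i < count; ++i) {
    sum += values[i];
  }
  delete[] values;  // must be delete[], not plain delete
  return sum;
}

void ArrayDemo() {
  assert(SumOfSquares(4) == 14);  // 0 + 1 + 4 + 9
}
```

Pairing new[] with plain delete (or the reverse) is undefined behavior, which is one more thing to get wrong by hand.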
There are so many ways this can go wrong:
- If you forget the delete keyword, you’ll end up with memory leaks, which can make your program slower or cause it to crash with cryptic errors once memory runs out.
- If you double-delete, you’ll end up with undefined behavior (usually your program crashing).
- Conditional code paths can require a delete at the end of each path.
- After deleting a pointer, you’re left with a dangling pointer unless you manually set it to nullptr.
- If an exception is thrown between new and delete, the latter will not be reached.
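To make the last pitfall concrete, here is a small sketch of a leak caused by an exception. Tracked, DoWork, LeakyFunction, and CountLeaksAfterThrow are all hypothetical names, and the live_count counter exists only so the leak can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
  static int live_count;
  Tracked() { ++live_count; }
  ~Tracked() { --live_count; }
};
int Tracked::live_count = 0;

// Hypothetical function that may fail partway through some work.
void DoWork(bool should_throw) {
  if (should_throw) {
    throw std::runtime_error("something went wrong");
  }
}

void LeakyFunction(bool should_throw) {
  Tracked* object_ptr = new Tracked();
  DoWork(should_throw);  // if this throws...
  delete object_ptr;     // ...this line is never reached
}

// Returns how many Tracked objects were leaked by one throwing call.
int CountLeaksAfterThrow() {
  int before = Tracked::live_count;
  try {
    LeakyFunction(true);
  } catch (const std::runtime_error&) {
    // The exception is handled, but the Tracked object is still on the heap.
  }
  return Tracked::live_count - before;
}
```

The caller handles the exception correctly, yet one object is left behind with no pointer to it.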
RAII is the solution.
What’s RAII?
RAII stands for Resource Acquisition Is Initialization. While that might seem a bit ambiguous at first, it’s actually quite a simple concept.
There are many cases where a resource needs to be set up when an object is created and cleaned up when that object is destroyed.
If you’re thinking something like “Wait, isn’t that what constructors and destructors are for?”, then you’re catching on quickly.
RAII is simply a design idiom where resource management is tied to the lifetime of objects. In this approach, we create a wrapper object for some resource (usually memory, file handles, network connections, etc). This wrapper object’s…
- constructor acquires and initializes the resource, and
- destructor cleans up and releases the resource.
This makes the clean up and release of the resource automatic when the wrapper object goes out of scope.
Let’s build a super simple RAII wrapper to help manage heap-allocated objects by abstracting away the new and delete keywords…
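A minimal sketch of such a wrapper might look like the following. The instance counter on MyObject, the deleted copy operations, the get() accessor, and the UseWrapper function (which stands in for main) are assumptions added for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for MyObject; it counts live instances so we can
// observe when the destructor runs.
struct MyObject {
  static int live_count;
  MyObject() { ++live_count; }
  ~MyObject() { --live_count; }
};
int MyObject::live_count = 0;

class SimpleResourceWrapper {
 public:
  // Constructor acquires the resource (takes ownership of the raw pointer).
  explicit SimpleResourceWrapper(MyObject* object_ptr)
      : object_ptr_(object_ptr) {}

  // Destructor releases the resource.
  ~SimpleResourceWrapper() { delete object_ptr_; }

  // Copying would lead to a double delete, so this sketch forbids it.
  SimpleResourceWrapper(const SimpleResourceWrapper&) = delete;
  SimpleResourceWrapper& operator=(const SimpleResourceWrapper&) = delete;

  MyObject* get() const { return object_ptr_; }

 private:
  MyObject* object_ptr_;
};

// Stands in for main: the wrapper is a local variable in this scope.
void UseWrapper() {
  SimpleResourceWrapper wrapper(new MyObject());
  assert(wrapper.get() != nullptr);
}  // wrapper is destructed here, which deletes the MyObject
```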
With this example, memory management is almost trivial.
When execution returns from the main function, SimpleResourceWrapper is destructed, which will invoke the delete operator on the MyObject pointer. Aside from the implementation of this wrapper, there should be no need to manually use the delete keyword in our code.
Something doesn’t quite seem right with the above implementation though… Why do we still have the new keyword in our usage of the RAII wrapper? Shouldn’t that be abstracted away as well?
We’ll address this in the “Unique Pointer” section below.
The good news is if you see the value of SimpleResourceWrapper, you pretty much already understand the value proposition of smart pointers.
Smart Pointers
There are three types of smart pointers: unique, shared, and weak.
Each type is a memory managing RAII wrapper around pointers that provides some assurances about how it will manage the memory and pointer lifetime.
The smart pointer library can be included via #include <memory>.
Unique Pointer
This is the most intuitive smart pointer. It is essentially a glorified version of our SimpleResourceWrapper discussed above.
When std::unique_ptr goes out of scope, its destructor applies the delete operator to the memory it manages.
As its name suggests, std::unique_ptr provides unique ownership of the memory. This means with correct usage, there will only ever be one smart pointer managing the underlying raw pointer. Because of that, unique pointers cannot be copied; however, ownership of a managed raw pointer can be transferred to another unique pointer.
So when should you use a unique pointer?
Unless you need multiple owners of the same dynamically allocated object, you should use a unique pointer. This will enforce that you don’t accidentally make copies of the pointer and ensure that delete is properly applied when the pointer goes out of scope.
Here are a few example use cases…
Let’s say you only need a managed object from within the scope it is created (and any helper functions called from that scope):
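Here is a minimal sketch of that case. The value field on MyObject and the DriverFunction name are placeholders for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical MyObject with a single field, for illustration only.
struct MyObject {
  int value = 0;
};

// Takes the unique pointer by reference: no copy, no ownership transfer.
void HelperFunction(std::unique_ptr<MyObject>& object_ptr) {
  object_ptr->value += 1;
}

int DriverFunction() {
  std::unique_ptr<MyObject> object_ptr(new MyObject());
  HelperFunction(object_ptr);
  assert(object_ptr != nullptr);  // still owned by this scope
  return object_ptr->value;
}  // object_ptr goes out of scope here and deletes the MyObject
```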
In the above example, HelperFunction takes in a reference to a unique pointer. This is important!
If the function parameter were not a reference, passing the unique pointer would require the unique pointer copy constructor. As discussed above, though, copies are not allowed on unique pointers (the copy constructor is deleted!), so it would cause a compiler error.
I did mention above that ownership can be transferred. When might we want to do that? There are two ways that unique pointer ownership can be transferred. (Both of these are rather involved topics, which I’ll be writing posts on in the coming weeks. For now, I’ve linked external resources and provided a TLDR for each.)
- Return Value Optimization (RVO) (a type of copy elision)
- Move semantics
RVO is a compiler optimization used throughout the C++ language, not just with unique pointers. Simply put, when a function creates and returns an object, the object will be constructed directly in the final location where the return value is used, rather than creating and copying a temporary object.
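For unique pointers, this is what makes returning one from a function work even though copies are forbidden. In the sketch below (CreateObject and UseFactory are hypothetical names), the returned pointer is either constructed directly in the caller's variable or, failing that, moved into it. It is never copied.

```cpp
#include <cassert>
#include <memory>

// Hypothetical MyObject with a single field, for illustration only.
struct MyObject {
  int value = 0;
};

// Hypothetical factory: builds the object and returns the owner by value.
std::unique_ptr<MyObject> CreateObject(int value) {
  std::unique_ptr<MyObject> object_ptr(new MyObject());
  object_ptr->value = value;
  return object_ptr;  // no copy: the return is elided or treated as a move
}

int UseFactory() {
  std::unique_ptr<MyObject> owner = CreateObject(42);
  assert(owner != nullptr);
  return owner->value;
}  // owner goes out of scope here and deletes the MyObject
```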
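Move semantics (usage of std::move) let one object hand over its resources to another instead of copying them. std::move itself only marks its argument as movable (it casts it to an rvalue); the move constructor or move assignment operator of the receiving object performs the actual transfer. In the context of unique pointers, move semantics can be used like this: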
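A minimal sketch of such a transfer, assuming a placeholder MyObject with one field:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical MyObject with a single field, for illustration only.
struct MyObject {
  int value = 7;
};

// Takes the unique pointer by value, so the caller must hand over ownership.
int HelperFunction(std::unique_ptr<MyObject> object_ptr2) {
  return object_ptr2->value;
}  // object_ptr2 is destroyed when the call completes, deleting the MyObject

int DriverFunction() {
  std::unique_ptr<MyObject> object_ptr1(new MyObject());
  int result = HelperFunction(std::move(object_ptr1));
  assert(object_ptr1 == nullptr);  // object_ptr1 no longer owns anything
  return result;
}
```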
In the above example, DriverFunction transfers ownership of object_ptr1 to HelperFunction’s object_ptr2 without copying the underlying instance of MyObject. Effectively, this means that object_ptr1 will no longer be responsible for applying the delete operator on the underlying raw pointer when it goes out of scope; that responsibility now falls on object_ptr2.
Above, I mentioned we’d address how to get rid of the new keyword.
std::unique_ptr has a factory function, std::make_unique, which safely and efficiently constructs instances of std::unique_ptr without needing the new keyword. You can use this factory function in either of the ways below:
std::unique_ptr<MyObject> object_ptr = std::make_unique<MyObject>();
or
std::unique_ptr<MyObjectWithArgs> object_ptr =
std::make_unique<MyObjectWithArgs>(arg1, arg2);
How std::make_unique works under the hood is beyond the scope of this article. I’ll be circling back to this in the coming weeks though, after deep dives on a handful of prerequisites like RVO, move semantics, variadic arguments, and perfect forwarding.
Now that you’re familiar with std::unique_ptr, shared and weak pointers will be a breeze.
Shared Pointer
Shared pointers behave very similarly to unique pointers, except without the assurance of exclusivity.
Multiple shared pointers can be created to refer to the same managed object (usually via the std::shared_ptr copy constructor or copy assignment operator). The delete operator is applied once the last shared pointer referring to that object goes out of scope.
Let’s jump right into a code example…
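Here is a minimal sketch of what that could look like. MyObject and CountAfterCreation are placeholder names for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical MyObject with a single field, for illustration only.
struct MyObject {
  int value = 0;
};

long CountAfterCreation() {
  std::shared_ptr<MyObject> object_ptr = std::make_shared<MyObject>();
  assert(object_ptr != nullptr);
  return object_ptr.use_count();  // one shared pointer owns the object
}  // the only owner is destroyed here, so the MyObject is deleted
```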
Notice how we use the std::make_shared factory function for shared pointers.
This code will behave the same as a unique pointer in terms of the managed object’s lifetime. The difference is that shared pointers store a reference count along with the managed object. When we first create the shared pointer above, it will have a reference count of one. In a memory diagram, the shared pointer on the stack would point to the managed object on the heap, which is stored alongside a reference count of 1.
Let’s write a more complex code snippet…
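The sketch below is one way such a snippet could be written. The UseCount accessor, the return value of SomeFunction, and RunExample (which stands in for main) are assumptions added so the reference counts can be observed.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical Node with a single field, for illustration only.
struct Node {
  int value = 0;
};

class ContainerObject {
 public:
  // arg_node is passed by value, so the call copies the caller's pointer.
  void SetPtr(std::shared_ptr<Node> arg_node) {
    // Move assignment: container_node_ takes over and arg_node becomes empty.
    container_node_ = std::move(arg_node);
  }

  long UseCount() const { return container_node_.use_count(); }

 private:
  std::shared_ptr<Node> container_node_;
};

// Returns the reference count observed right after SetPtr returns.
long SomeFunction(ContainerObject& container) {
  std::shared_ptr<Node> node = std::make_shared<Node>();  // count: 1
  container.SetPtr(node);                                  // count: 2
  return node.use_count();
}  // node goes out of scope here, so the count drops to 1

// Stands in for main.
void RunExample() {
  ContainerObject container;
  long count_inside = SomeFunction(container);
  assert(count_inside == 2);
  assert(container.UseCount() == 1);
}  // container is destructed here; the count reaches 0 and the Node is deleted
```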
The following chain of events occurs relating to shared pointers in the above example:
- node is created in SomeFunction.
- The ContainerObject::SetPtr call invokes the shared pointer copy constructor to initialize the parameter arg_node. This increases the reference count of the underlying managed Node object to 2.
- std::move in ContainerObject::SetPtr transfers ownership from arg_node to container_node_. arg_node is still within scope and is set to nullptr. This operation does not change the reference count.
- When execution returns from ContainerObject::SetPtr to SomeFunction, arg_node goes out of scope. This does not affect the managed Node object, as that shared pointer was no longer managing it.
- When execution returns from SomeFunction to main, the original node shared pointer goes out of scope. This decreases the reference count to 1.
- Lastly, main returns, destructing its local variable container. This causes container_node_ to go out of scope. When container_node_ is destructed, the shared pointer’s reference count becomes 0, and the delete operator is applied to the managed Node object.
At this point, you might be wondering “why don’t we always use shared pointers, as they behave the same as unique pointers if we never make copies of them?”
Well, two reasons:
- std::shared_ptr has some performance overhead because of the reference count. Every time a shared pointer is copied or destroyed, the reference count has to be updated atomically so that it stays correct across threads. This synchronization cost can add up fast in a busy application.
- Performance aside, if a resource should logically only have one owner (like a database or network connection, for example), then representing it with a shared pointer is misleading and can allow unknowing developers to introduce bugs to your application. Representing uniquely owned resources with unique pointers is clearer.
Weak Pointer
Weak pointers are used in conjunction with shared pointers.
TLDR: A weak pointer can be thought of as a variation of a shared pointer that does not contribute to the reference count.
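As a small sketch of that behavior (MyObject, owner, observer, and WeakPointerBasics are placeholder names), a weak pointer has to be converted to a shared pointer with lock() before the object can be used, and it can report when the object is gone:

```cpp
#include <cassert>
#include <memory>

// Hypothetical MyObject with a single field, for illustration only.
struct MyObject {
  int value = 5;
};

// Returns true if the weak pointer noticed that the object was deleted.
bool WeakPointerBasics() {
  std::shared_ptr<MyObject> owner = std::make_shared<MyObject>();
  std::weak_ptr<MyObject> observer = owner;  // does not increase the count
  assert(owner.use_count() == 1);

  // lock() yields a shared pointer if the object is still alive.
  if (std::shared_ptr<MyObject> locked = observer.lock()) {
    assert(locked->value == 5);
    assert(owner.use_count() == 2);  // locked is a second owner while in scope
  }

  owner.reset();  // the last owner is gone, so the MyObject is deleted
  assert(observer.lock() == nullptr);
  return observer.expired();
}
```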
The textbook example of weak pointers is to represent cycles in graphs, while avoiding memory leaks caused by circular dependencies.
Let’s consider a tree where each node holds child and parent pointers. Our minimal example only needs two nodes: nodeA is the root of our tree and the parent of nodeB.
To illustrate the need for a better solution, let’s see why an implementation using only shared pointers won’t suffice…
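A minimal sketch of the shared-pointer-only version could look like this. The live_count counter on Node is an assumption added purely so the outcome can be observed.

```cpp
#include <cassert>
#include <memory>

struct Node {
  static int live_count;  // hypothetical counter of Node objects still alive
  Node() { ++live_count; }
  ~Node() { --live_count; }

  std::shared_ptr<Node> child;
  std::shared_ptr<Node> parent;
};
int Node::live_count = 0;

void SomeFunction() {
  std::shared_ptr<Node> nodeA = std::make_shared<Node>();
  std::shared_ptr<Node> nodeB = std::make_shared<Node>();
  nodeA->child = nodeB;
  nodeB->parent = nodeA;
  assert(nodeA.use_count() == 2);
  assert(nodeB.use_count() == 2);
}  // nodeA and nodeB are destructed here
```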
Do you notice anything that will go wrong here?
Let’s walk through what exactly is going on…
- We construct a shared pointer called nodeA along with an instance of Node, which lives on the heap. This new Node object will have a reference count of 1.
- Same as above, but with nodeB.
- We set the shared pointer nodeA->child to point to the managed Node object represented by the shared pointer nodeB, increasing that object’s reference count to 2.
- Similar to above, we set the shared pointer nodeB->parent to point to the managed Node object represented by the shared pointer nodeA, increasing that object’s reference count to 2.
- Now, execution reaches the end of SomeFunction, destructing the shared pointers nodeA and nodeB. Upon destruction, these shared pointers decrement the reference count of their managed objects to 1, but they do not clean up those objects, because the reference counts are still non-zero.
We have a memory leak 😰…
Isn’t that what smart pointers were supposed to help resolve?
Weak pointers to the rescue!
First, we need to recognize that the real issue above was the circular dependency between two shared pointers. To fix this, we just need to break the circular dependency by replacing the shared pointers with weak pointers in one direction of the graph.
Let’s say we (arbitrarily) want parent nodes to strongly own their children, and children to hold non-owning weak references to their parents. Our modified code will look like this:
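A minimal sketch of the modified version follows. The live_count counter is again an assumption for observing cleanup, and BuildTree stands in for main.

```cpp
#include <cassert>
#include <memory>

struct Node {
  static int live_count;  // hypothetical counter of Node objects still alive
  Node() { ++live_count; }
  ~Node() { --live_count; }

  std::shared_ptr<Node> child;  // parents strongly own their children
  std::weak_ptr<Node> parent;   // children only weakly reference parents
};
int Node::live_count = 0;

// Stands in for main.
void BuildTree() {
  std::shared_ptr<Node> nodeA = std::make_shared<Node>();
  std::shared_ptr<Node> nodeB = std::make_shared<Node>();
  nodeA->child = nodeB;   // Node (B) count: 2
  nodeB->parent = nodeA;  // Node (A) count stays at 1
  assert(nodeA.use_count() == 1);
  assert(nodeB.use_count() == 2);
}  // both Node objects are deleted on the way out of this scope
```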
Note how in the above implementation, both variables nodeA and nodeB in main are still shared pointers even though the object pointed to by nodeB is the child of the object pointed to by nodeA.
Let’s take a deeper look at this modified implementation…
- Consider the state just before returning from main. The reference count of the managed object Node (A) is only 1 because the weak pointer referencing the object does not contribute to the reference count.
- When we do return from main, the nodeA and nodeB variables will be destructed, reducing each of the managed Node objects’ reference counts by 1.
- Still in the process of returning from main, the Node object representing the parent node, Node (A), now has a reference count of zero, causing the delete operator to be applied, which destructs and deallocates the object.
- In deleting the parent node, the child node’s reference count is decremented to zero, and thus Node (B) will also have the delete operator applied.
The above chain of events all happens while returning from main in our implementation utilizing weak pointers.
Memory leak resolved!
Wrapping Up
(pun intended 😉)
Smart pointers are powerful tools that drastically increase engineer productivity by reducing the need to track every new and delete keyword.
There are many instances of RAII throughout C++ (and software development in general). Now that you’re familiar with the concept, you’ll begin to notice this popping up all over the place.
One of the most common use cases of RAII aside from smart pointers is std::lock_guard, which automatically acquires and releases a std::mutex when writing multi-threaded applications (more on that in a future post, though 🤓).
What’s Next?
As I mentioned earlier, there are quite a few building blocks required to fully understand std::make_unique. Those same building blocks happen to be the foundation of many advanced topics in real world C++.
Beginning next week, we’ll start to dive into each of those building blocks to achieve a more well-rounded understanding.
Sources & More Info
- C++ Docs — unique_ptr
- C++ Docs — shared_ptr
- C++ Docs — weak_ptr
- University of Michigan — EECS 482 — Discussion of RAII