C++ Beyond the Syllabus #6: Move Semantics Pt. 2 — Rvalue References & The Rule of Five
An rvalue reference is like a magic glove letting you temporarily catch, hold, and modify a bubble without popping it.
This article is a part of the C++ Beyond the Syllabus series.
Move semantics is a powerful tool that can drastically increase the performance of your code by reducing unnecessary and expensive copies of data on the heap. Before you can get to move semantics though, you need a sound understanding of rvalue references.
In the last article, we covered the basics of lvalues, rvalues, and an argument for shallow copies. If any of those sound unfamiliar to you, I’d definitely recommend taking a quick look at that post before continuing:
Lvalue & Rvalue References
Back To The 5 Year Olds!
In the previous article, we related lvalues to a stuffed animal on the shelf and rvalues to a bubble…
Lvalue references:
- Recall your stuffed animal, named lvalue, who has a specific spot on the shelf
- Let’s say your friend wants to borrow lvalue, so you let them with a couple ground rules… They can play with lvalue, take it on walks, and even dress it up or cut its hair, but at the end of play time, they must always put lvalue back in its spot on your shelf.
- And your friend’s name? lvalue reference, of course.
Rvalue references:
- Recall the bubble you can play with, but not hold because it will inevitably pop. That bubble was an rvalue.
- An rvalue reference is like a magic glove: it lets you temporarily catch, hold, and modify the bubble without popping it.
- The bubble is still going to pop as soon as you take the magic glove off; the glove just lets you use the bubble a little while longer.
Let’s Get Technical With It
Lvalue references are the traditional reference type, denoted by T& for some type T. You are probably pretty familiar with these already, so I’ll skip right to rvalue references.
In code, an rvalue reference to some non-template type T is denoted as T&&. The place you have most likely already seen this is as the parameter of a class’s rvalue copy constructor and rvalue copy assignment operator (more commonly called the move constructor and move assignment operator).
Let’s work our way up to those…
The Use Case For Rvalue References
The Big Three
“The Big Three” is a concept taught to almost all C++ students. The main gist is that any class managing dynamically allocated resources (i.e., stored on the heap) should also implement a…
- destructor to clean up dynamically allocated data
- copy constructor to perform deep copies of dynamically allocated data
- copy assignment operator to clean up old data and deep copy new data
Here’s a simple implementation of a class that implements The Big Three:
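A minimal sketch of such a class, assuming Object owns a single heap-allocated int. The member name data_ and the helpers value() and set() are made up for illustration. The copy operations take a non-const Object& only to mirror the signatures this article discusses; const Object& is the conventional choice.

```cpp
// Hypothetical Object that owns one heap-allocated int.
class Object {
public:
    explicit Object(int value) : data_(new int(value)) {}

    // Destructor: release the owned allocation.
    ~Object() { delete data_; }

    // Copy constructor: deep copy, so each object owns its own int.
    Object(Object& other) : data_(new int(*other.data_)) {}

    // Copy assignment: build the new copy first, then free the old data.
    Object& operator=(Object& other) {
        if (this != &other) {
            int* copy = new int(*other.data_);
            delete data_;
            data_ = copy;
        }
        return *this;
    }

    int value() const { return *data_; }
    void set(int v) { *data_ = v; }

private:
    int* data_;
};
```

Because the copies are deep, changing one Object through set() never affects an Object that was copied from it.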
The concept of The Big Three was created prior to C++11, which introduced rvalue references and move semantics.
Since then, The Big Three has been augmented to the “Rule of Five,” which is also sometimes taught in school, but often not given the love it deserves.
The Rule of Five
The Rule of Five includes the three special member functions from The Big Three, along with two additional functions…
- rvalue/move copy constructor
- rvalue/move copy assignment operator
Note: The above are often referred to as “rvalue copy X” and “move copy X” interchangeably (the standard names are simply “move constructor” and “move assignment operator”). For the remainder of this article, I will call them “rvalue copy X.” As I explain in this series, the functions themselves don’t really move anything. You could theoretically swap all of the code from the lvalue and rvalue versions of these functions (along with some other code changes) and everything would behave the same.
Before getting into the syntax, know that the main point of these new function overloads is to facilitate the transfer of ownership of resources (i.e. — move semantics) as an alternative to performing a deep copy.
Recall the value proposition of transferring ownership:
We can eliminate unneeded copies of dynamically allocated data when copying from an object that will not use the managed data again.
Below, we modify our Object class for the Rule of Five…
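One way the Rule of Five version could look, again assuming Object owns a single heap-allocated int. The names data_, data(), and value() are illustrative assumptions, and the copy operations keep non-const parameters only to mirror the signatures this article discusses.

```cpp
// Hypothetical Object that owns one heap-allocated int.
class Object {
public:
    explicit Object(int value) : data_(new int(value)) {}
    ~Object() { delete data_; }  // deleting nullptr is a no-op

    // Lvalue copy constructor: deep copy (tolerates an emptied source).
    Object(Object& other)
        : data_(other.data_ ? new int(*other.data_) : nullptr) {}

    // Lvalue copy assignment: deep copy, then free the old data.
    Object& operator=(Object& other) {
        if (this != &other) {
            int* copy = other.data_ ? new int(*other.data_) : nullptr;
            delete data_;
            data_ = copy;
        }
        return *this;
    }

    // Rvalue copy (move) constructor: take over the pointer, no allocation.
    Object(Object&& other) noexcept : data_(other.data_) {
        other.data_ = nullptr;  // leave other valid and destructible
    }

    // Rvalue copy (move) assignment: free ours, then take over other's.
    Object& operator=(Object&& other) noexcept {
        if (this != &other) {
            delete data_;
            data_ = other.data_;
            other.data_ = nullptr;
        }
        return *this;
    }

    const int* data() const { return data_; }
    int value() const { return *data_; }

private:
    int* data_;
};
```

With this in place, an assignment from a temporary such as `obj = Object(9);` selects the rvalue overload and performs no deep copy.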
Notice that the only difference between the signatures of the two new functions and their Big Three counterparts is that the new functions accept their argument as an rvalue reference instead of an lvalue reference.
The Transfer of Ownership
This seems pretty intuitive, but the steps are worth stating explicitly because they are prone to error. If you write rvalue copy constructors and rvalue copy assignment operators on a regular basis, it’s good to keep these steps in mind…
Responsibilities of the rvalue copy constructor:
- Initialize dynamically allocated member variables with shallow copies of other’s dynamically allocated members.
- Reset other’s dynamically allocated members to some valid, destructible state. (Note, for simplicity, my example’s destructor accounts for nullptr.)
Responsibilities of the rvalue copy assignment operator:
- Clean up the dynamically allocated resource of the current object.
- Carry out all responsibilities of the rvalue copy constructor.
A common trick for the rvalue copy assignment operator, which takes care of all needed responsibilities, is to simply swap the managed resources of the current object and other, like below:
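A sketch of the swap trick, assuming a stripped-down Object that owns one heap-allocated int. The names data_ and data() are illustrative, and the lvalue copy operations are deleted only to keep the example short.

```cpp
#include <utility>  // std::swap

// Hypothetical Object that owns one heap-allocated int.
class Object {
public:
    explicit Object(int value) : data_(new int(value)) {}
    ~Object() { delete data_; }

    Object(const Object&) = delete;             // omitted for brevity
    Object& operator=(const Object&) = delete;  // omitted for brevity

    Object(Object&& other) noexcept : data_(other.data_) {
        other.data_ = nullptr;
    }

    // Swap-based rvalue copy (move) assignment: we take other's resource,
    // and other takes ours, which its destructor will free later.
    Object& operator=(Object&& other) noexcept {
        std::swap(data_, other.data_);
        return *this;
    }

    const int* data() const { return data_; }

private:
    int* data_;
};
```

Nothing is freed inside the operator itself. The old resource simply rides along in other until other goes out of scope.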
Under the hood, std::swap itself is typically implemented with one rvalue copy construction and two rvalue copy assignments on the object/resource being swapped. The trick works because move semantics (i.e., the act of invoking these rvalue function overloads) makes no guarantees about the state of the moved-from object other than that it must be valid and destructible. So, we can simply swap it with whatever valid and destructible state our moved-to object currently has. The old resource is then cleaned up whenever other is destroyed.
Your Burning Question
The rvalue references as arguments to the above rvalue copy constructor and rvalue copy assignment operator are denoted as Object&& other. That other thing… that’s a name…
Does this mean rvalue references are… lvalues?
YES!
That was a great question 😉. It’s pretty counterintuitive, but a named rvalue reference such as other is itself an lvalue when used in an expression: it satisfies all of the conditions for lvalues that we discussed in our last article. Recall: an lvalue…
- has a name
- can appear on either side of an assignment operator (=)
- can have its address taken with the address-of operator (&)
So, what happens when we need to access an rvalue from an rvalue reference? This is where std::move comes into play. It’s not very intuitive, so I’ll cover it in-depth later in this series.
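To see the "named rvalue reference is an lvalue" point in action, here is a small sketch. The function names category and through_named_reference are made up for illustration.

```cpp
#include <string>
#include <type_traits>

// Two overloads that report which kind of argument they received.
std::string category(int&)  { return "lvalue"; }
std::string category(int&&) { return "rvalue"; }

std::string through_named_reference(int&& ref) {
    // ref has a name, so the expression `ref` is an lvalue, even though
    // the declared type of ref is "rvalue reference to int".
    static_assert(std::is_lvalue_reference<decltype((ref))>::value,
                  "a named rvalue reference is an lvalue expression");
    return category(ref);  // selects category(int&)
}
```

Calling category(5) picks the int&& overload, while through_named_reference(5) ends up in the int& overload because it forwards the named parameter.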
Reference Binding
Now that we are on the same page for where rvalue references are used, you might have some questions about how these functions are invoked.
To get there, we have to talk about reference binding.
Lvalue Reference Binding
You’ve been binding lvalue references since your 100-level CS course. Here’s a simple example:
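A minimal version of such an example, with illustrative variable names (written at namespace scope so the snippet stands alone):

```cpp
int value = 5;
int& ref_to_value = value;       // binds to the lvalue `value`
int& ref_to_ref = ref_to_value;  // binds through another lvalue reference
```

Both references are just additional names for value, so writing through either one changes value itself.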
The above code compiles. As you see on lines two and three respectively, an lvalue reference can bind to an lvalue or another lvalue reference.
Now consider this example:
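A sketch of the kind of code in question. The offending line is left as a comment so the snippet still builds, and the quoted diagnostic is roughly what Clang reports (other compilers word it differently). The static_assert is an extra, assumed way of checking the same fact at compile time.

```cpp
#include <type_traits>

// int& lvalue_ref = 5;
//   error: non-const lvalue reference to type 'int' cannot bind to a
//   temporary of type 'int'

// The same rule, checked without breaking the build:
static_assert(!std::is_constructible<int&, int>::value,
              "int& cannot be initialized from an rvalue int");
```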
Let’s break this down…
- non-const lvalue reference to type ‘int’: that must refer to int& in our code
- temporary of type ‘int’: this is effectively another way of saying rvalue int
- Thus, the error tells us we can’t bind a non-const lvalue reference to an rvalue
Hmmm… but why does it specify non-const?
Well, if we make the lvalue reference const and throw it in Compiler Explorer, we see it compiles (link):
From looking at the generated assembly with optimizations disabled, we can see the compiler writes the value 5 on the stack and stores the address of that value in lvalue_ref.
But lvalue_ref doesn’t reference an rvalue then, right?
Right, not anymore.
C++ has a type of implicit conversion called temporary materialization, which turns an rvalue (strictly, a prvalue) such as 5 into a temporary object that a const lvalue reference can bind to. The lifetime of that temporary is then extended to match the lifetime of the const lvalue reference.
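Both effects can be observed with a small sketch. Tracer is a hypothetical helper type whose destructor sets a flag, used here only to show that the temporary is still alive after the statement that created it.

```cpp
// Reads an rvalue through a const lvalue reference.
int read_through_const_reference() {
    const int& lvalue_ref = 5;  // a temporary int holding 5 is materialized
    return lvalue_ref;          // the temporary lives as long as lvalue_ref
}

// Hypothetical helper: records when it is destroyed.
struct Tracer {
    bool* destroyed;
    ~Tracer() { *destroyed = true; }
};

bool temporary_outlives_statement() {
    bool destroyed = false;
    const Tracer& ref = Tracer{&destroyed};  // lifetime extended to ref's scope
    (void)ref;                               // silence unused-variable warnings
    return !destroyed;                       // still alive at this point
}
```

Without the reference, the Tracer temporary would be destroyed at the end of the full expression that created it.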
Rvalue Reference Binding
Rvalue reference binding is actually quite simple. There is one rule:
Rvalue references can only bind to rvalues.
Remember the magic glove analogy: just like the glove, an rvalue reference allows you to modify the rvalue it is bound to.
Const qualifiers behave similarly to those on lvalue references, with one caveat: rvalues of non-class type, such as the literal 5, are never const. There is, however, such a thing as an rvalue reference to const (const T&&), which isn’t used too often. I’ll touch on that more when discussing std::move in my next article.
Note that while const rvalue references exist, they really aren’t used that often. The main use case of rvalue references in practice is to enable move semantics (i.e. — transfer of ownership) and any const qualifiers would prevent that from happening.
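A short sketch of these binding rules, with made-up function and variable names. The lines that would not compile are left as comments.

```cpp
int modify_through_rvalue_reference() {
    int&& glove = 5;  // binds to the rvalue 5 and keeps the temporary alive
    glove += 1;       // modifying the temporary is allowed
    return glove;     // 6
}

int read_through_const_rvalue_reference() {
    const int&& sealed_glove = 5;  // legal, but read-only
    // sealed_glove += 1;          // error: the referenced int is const
    return sealed_glove;           // 5
}

// int x = 5;
// int&& bad = x;  // error: an rvalue reference cannot bind to an lvalue
```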
Quick Recap
Now, you’re probably like “woah — lots of information… what about that really matters?”
Here’s a chart to sum up the important takeaways.
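The same binding rules can also be written down as compile-time checks. This is only a sketch: can_bind is a hypothetical helper built on std::is_constructible, where an argument type of int& stands in for a non-const lvalue, const int& for a const lvalue, and int for an rvalue.

```cpp
#include <type_traits>

// Can a reference of type Ref be initialized from an expression
// described by Arg?
template <typename Ref, typename Arg>
constexpr bool can_bind() {
    return std::is_constructible<Ref, Arg>::value;
}

// Non-const lvalue reference: non-const lvalues only.
static_assert( can_bind<int&, int&>(), "");
static_assert(!can_bind<int&, const int&>(), "");
static_assert(!can_bind<int&, int>(), "");

// Const lvalue reference: binds to everything.
static_assert( can_bind<const int&, int&>(), "");
static_assert( can_bind<const int&, const int&>(), "");
static_assert( can_bind<const int&, int>(), "");

// Rvalue reference: rvalues only.
static_assert(!can_bind<int&&, int&>(), "");
static_assert(!can_bind<int&&, const int&>(), "");
static_assert( can_bind<int&&, int>(), "");

// Const rvalue reference: rvalues only.
static_assert(!can_bind<const int&&, int&>(), "");
static_assert(!can_bind<const int&&, const int&>(), "");
static_assert( can_bind<const int&&, int>(), "");
```

If any of these rules were wrong, the file would fail to compile at the corresponding static_assert.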
Overload Resolution
Sooo, that might leave you with another burning question:
How does overload resolution work?
Let’s start with what you’re most likely to be familiar with already. Consider a simple example with two function definitions accepting an lvalue reference as an argument, differing only by constness. What will the following code output?
If you’re unsure, plug it into your IDE or Compiler Explorer. You’ll find that when dealing with the same value category (i.e., lvalues), overload resolution favors functions whose parameters have the same constness as the arguments.
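A sketch of that experiment. Here the overloads return a label instead of printing, and the two wrapper functions are made up for illustration.

```cpp
#include <string>

std::string func(int&)       { return "lvalue_ref"; }
std::string func(const int&) { return "const_lvalue_ref"; }

std::string call_with_non_const_lvalue() {
    int value = 5;
    return func(value);  // exact match on constness: func(int&)
}

std::string call_with_const_lvalue() {
    const int value = 5;
    return func(value);  // only func(const int&) can accept a const int
}
```

The non-const argument could bind to either overload, but the one that does not add const is the better match.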
But what about when passing an rvalue argument into an overloaded function?
Well, from our chart above, we know acceptable function overloads would accept parameters of type
- const lvalue reference,
- rvalue reference, or
- const rvalue reference
Consider our running Object example. You may have noticed there’s really no reason the lvalue copy constructor or lvalue copy assignment operator need to expect non-const parameters. In fact, they probably should be const for the sake of clean code!
But if we make that change, we need to know how the compiler will resolve those overloads at compile time.
Let’s throw together a simplified example:
Copying this code snippet into Compiler Explorer (link), we find the output to be
rvalue_ref: 5
This makes sense. As a general rule of thumb, among the valid argument-parameter matches, overload resolution favors:
- parameters of the same value category and constness as the argument the most
- followed by parameters of the same value category, but different constness
- followed by parameters of a different value category
Simply commenting out the rvalue reference overload (func(int&& rvalue_ref)) in our example yields the output,
const_rvalue_ref: 5
which confirms the above rule of thumb.
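Both experiments can be reproduced side by side with a sketch like the following. The namespaces are an assumption made here so that the "commented out" variants can coexist in one file, and the overloads return a label instead of printing.

```cpp
#include <string>

namespace all_overloads {
std::string func(const int&)  { return "const_lvalue_ref"; }
std::string func(int&&)       { return "rvalue_ref"; }
std::string func(const int&&) { return "const_rvalue_ref"; }
}  // namespace all_overloads

namespace without_rvalue_ref {
std::string func(const int&)  { return "const_lvalue_ref"; }
std::string func(const int&&) { return "const_rvalue_ref"; }
}  // namespace without_rvalue_ref

namespace only_const_lvalue_ref {
std::string func(const int&)  { return "const_lvalue_ref"; }
}  // namespace only_const_lvalue_ref
```

Calling func(5) in each namespace walks down the priority list: all_overloads::func(5) selects the int&& overload, without_rvalue_ref::func(5) falls back to const int&&, and only_const_lvalue_ref::func(5) falls back to const int&.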
For reference, here is the binding chart from above modified for overload resolution priority:
What’s Next?
If you’re following along with this Move Semantics series, you might be like, “okay, so we can pass an rvalue into the rvalue copy constructor or copy assignment operator to initiate a transfer of resource ownership, but what if we want to transfer ownership of a resource from an lvalue?”
That’s a good question, and quite frankly, transferring ownership from lvalues is usually the goal when using move semantics in practice.
Now that you’re a pro on lvalues, rvalues, lvalue references, and rvalue references, and understand how the transfer of ownership of resources actually works, there are a few more topics we need to address:
- Universal references (yep! there’s another important reference type 😅)
- Perfect Forwarding (std::forward)
- Forcing lvalues to transfer ownership (std::move)
- When to use (and not use) move semantics
In the next two articles of this series, I’ll demystify these topics so you know when, where, how, and why to use move semantics and related tools.
Sources & More Info
- Effective Modern C++ by Scott Meyers, Items 23–25, 28
- “C++ Rvalue References Explained” by Thomas Becker
- “Understanding lvalues and rvalues in C and C++” by Eli Bendersky
Collaborators
A special thanks to a few friends who took time to edit and review this post:
- Mason Wilie (LinkedIn)