C++ Beyond the Syllabus #5: Move Semantics Pt. 1 — Lvalues, Rvalues & A Case For Shallow Copies
If you cut corners learning move semantics, other engineers will know.
This article is a part of the C++ Beyond the Syllabus series.
You generally don’t need to understand lvalues and rvalues to write code that produces the correct output…
But if you’re using C++, you also care about performance, and being comfortable with these value categories is a prerequisite for understanding advanced, performance-related, modern C++ topics like move semantics and perfect forwarding.
By the end of this article, you’ll understand what lvalues and rvalues really are and how to distinguish between them, along with a little sneak peek at why they matter in the context of move semantics.
Lvalues & Rvalues
Explain It Like I’m a 5 Year Old
Lvalues:
- Imagine you have a stuffed animal that you keep on your shelf… you give it a really cute name, like Stuffy or Snuggles, or better yet: lvalue
- You know exactly where you keep lvalue on your shelf.
- You can pick lvalue up and play with it, but you always put it back in its spot on the shelf.
Rvalues:
- Unlike your stuffed animal lvalue, rvalues are more like a bubble
- You can look at the bubble, but you can’t pick up the bubble or save it for later
- The bubble is fun and temporary — it will pop soon and will disappear
Let’s Get More Technical
While compiler implementations and optimizations may vary, the theoretical gist of lvalues and rvalues in C++ remains the same.
Strictly speaking, being an lvalue or an rvalue is a property of an expression, but for now you can think of every variable or literal as being one or the other.
An lvalue (locator value or left hand value) is stored in some stable memory location. It can be accessed and modified throughout its lifetime via its memory address.
An rvalue (read value or right hand value) is temporary. It does not have an address you can take, and it disappears at the end of the expression it appears in (i.e., at the next semicolon).
In English, Please?
The two main ways to identify a value as an lvalue are to check whether…
- the value has a name, or
- you can take the address of the value with the address-of operator (&)
Rvalues are defined as the converse of the lvalue definition… if a value is not an lvalue, then it is an rvalue.
If it helps, here’s another distinction:
- an lvalue can appear on either side of an assignment operator (=)
- an rvalue may only ever appear on the right hand side of an assignment operator
Let’s See Some Examples
For starters…
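A minimal sketch of the basics, applying the tests from the previous section. The function name basics and the variable names are made up for illustration, and the lines that would not compile are left as comments.

```cpp
#include <cassert>

// Hypothetical helper for illustration only.
int basics() {
    int x = 5;       // x is an lvalue (it has a name); the literal 5 is an rvalue
    int y = x + 1;   // y is an lvalue; the result of x + 1 is an rvalue
    int* px = &x;    // OK: x is an lvalue, so its address can be taken

    // int* bad = &(x + 1);  // error: cannot take the address of an rvalue
    // 5 = x;                // error: an rvalue cannot be assigned to

    *px = 7;         // modifies x through its address
    assert(x == 7);
    return x + y;    // 7 + 6
}
```

Uncommenting either of the two error lines should make the compiler reject the code, which is a quick way to check whether something is an rvalue.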
Here’s an example of an lvalue without a name…
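One possible illustration, assuming a plain array (the names numbers and unnamed_lvalue are hypothetical): the array has a name, but its individual elements do not, and yet each element passes the address-of and assignment tests.

```cpp
#include <cassert>

// Hypothetical helper for illustration only.
int unnamed_lvalue() {
    int numbers[3] = {1, 2, 3};

    numbers[1] = 20;        // numbers[1] is an lvalue: it can sit on the left of =
    int* p = &numbers[1];   // ...and its address can be taken

    *p = 42;                // *p is also an lvalue that has no name of its own
    assert(numbers[1] == 42);
    return numbers[1];
}
```

So having a name is a handy shortcut for spotting lvalues, but being addressable is the more general test.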
Here’s an example of a triple whammy: a value that is definitely an rvalue…
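As a sketch of one way to read "triple whammy", here is a value that fails all three lvalue tests at once: it has no name, its address cannot be taken, and it cannot appear on the left of an assignment. The function make_number is hypothetical.

```cpp
#include <cassert>

// Hypothetical function for illustration: returns an int by value.
int make_number() { return 42; }

int triple_whammy() {
    // The value returned by make_number() is an rvalue on all three counts:
    //   1. it has no name
    //   2. its address cannot be taken:   int* p = &make_number();  // error
    //   3. it cannot be assigned to:      make_number() = 7;        // error
    int copy = make_number();  // it may only appear on the right hand side
    assert(copy == 42);
    return copy;
}
```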
At this point, you’re probably thinking “this isn’t that hard, what’s all the hype about?”
Why do these matter?
Like a lot of people, my intuition was that these value types play into the C++ memory model; something along the lines of:
- lvalues can have their address taken, which means they must be stored in memory
- rvalues can’t, so maybe that means they live only in registers
But, the C++ standard leaves almost all implementation details up to compiler developers, providing them freedom to reorganize and optimize your code. While the above intuition is conceptually correct, the standard allows compilers to store values wherever they’d like, as long as the interface of these value types does not change.
So if it’s not for the memory model, what is it?
In practice, the distinction between lvalues and rvalues enables functions to branch at compile time via overload resolution.
Before we get into that though, let’s take a look at the motivation for all of this.
The Motivation: An Argument For Shallow Copies
Remember “deep copies”?
As a refresher, this is when you have some type like a std::string or std::vector, which stores a pointer to allocated data on the heap. The implementation of this type then must make sure that both the copy constructor and copy assignment operator “deep copy” the data on the heap owned by the copied-from object and then set the copied-to object’s data pointer to the newly made copy of the data on the heap. This prevents a whole host of problems, like unintended updates to data, access after deletion, double deletion, etc.
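To make the refresher concrete, here is a rough sketch of a type that deep copies. IntBuffer is a hypothetical toy class written for this illustration; it is not how std::vector or std::string is actually implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical toy type that owns an array of ints on the heap.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size) : size_(size), data_(new int[size]()) {}

    // Copy constructor: allocate a fresh buffer and copy every element.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment: build a deep copy first, then swap it in.
    IntBuffer& operator=(const IntBuffer& other) {
        if (this != &other) {
            IntBuffer tmp(other);
            std::swap(size_, tmp.size_);
            std::swap(data_, tmp.data_);
        }
        return *this;  // tmp's destructor frees the old buffer
    }

    ~IntBuffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};

// Hypothetical check: after a deep copy, the two objects do not share data.
bool copies_are_independent() {
    IntBuffer a(3);
    a[0] = 1;
    IntBuffer b = a;  // deep copy
    b[0] = 99;        // does not affect a
    assert(a.data() != b.data());
    return a[0] == 1 && b[0] == 99;
}
```

Because each object owns its own buffer, destroying one never invalidates the other, which is exactly the protection described above.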
Great, we’re all caught up.
But what if we don’t want a deep copy?
Well, when wouldn’t we want a deep copy — doesn’t that just protect us from all the terrible side effects listed above?
Consider this ultra-simple and rather hand-wavy example…
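A sketch of what such an example might look like, using the names str1 and str2 from the discussion that follows. The wrapping function copy_example is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper for illustration only.
std::string copy_example() {
    std::string str1 = "~incredibly long string~";
    std::string str2 = str1;  // copy constructor: copies str1's characters

    // The two strings hold equal text in separate storage.
    assert(str1.data() != str2.data());

    // str1 is never used again past this point.
    return str2;
}
```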
In the example above, the std::string copy constructor will initialize str2 with a deep copy of the ~incredibly long string~ owned by str1.
Being an ~incredibly long string~, you might imagine this would take a toll on performance. You also might realize this performance hit is entirely unneeded because str1 is never used again after str2 is initialized!
Instead of creating a deep copy, we can just transfer ownership of the ~incredibly long string~ from str1 to str2. In doing this, str2 can just shallow copy str1’s data pointer and we can set str1’s data pointer to nullptr, which would be a much smaller performance hit than allocating the entire ~incredibly long string~ again.
This actual transfer of ownership is called move semantics.
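Here is a rough sketch of that transfer, written with a plain struct and free functions so that it needs nothing beyond what has been covered so far. ToyString, make_toy, transfer, and release are all hypothetical names for this illustration; std::string itself is far more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy type: a pointer to heap data plus a length.
struct ToyString {
    char* data;
    std::size_t size;
};

// Allocates a heap buffer and copies the text into it.
ToyString make_toy(const char* text) {
    std::size_t n = std::strlen(text);
    ToyString s{new char[n + 1], n};
    std::memcpy(s.data, text, n + 1);
    return s;
}

// Transfer of ownership: shallow copy the pointer, then null out the source.
ToyString transfer(ToyString& from) {
    ToyString to{from.data, from.size};  // no allocation, no character copy
    from.data = nullptr;
    from.size = 0;
    return to;
}

// Frees the buffer; calling delete[] on nullptr is a no-op.
void release(ToyString& s) {
    delete[] s.data;
    s.data = nullptr;
    s.size = 0;
}

bool demo_transfer() {
    ToyString str1 = make_toy("~incredibly long string~");
    char* original = str1.data;

    ToyString str2 = transfer(str1);
    bool ok = (str2.data == original) && (str1.data == nullptr);
    assert(ok);

    release(str2);  // only str2 owns the buffer now
    release(str1);  // safe: str1.data is nullptr
    return ok;
}
```

Setting the source pointer to nullptr is what keeps this shallow copy safe: only one object ever owns the buffer, so there is no double deletion.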
For now, think about this:
If we have one function overload to perform a deep copy and another to perform a transfer of ownership, how can we instruct the compiler which one we want to use at compile time?
The answer: rvalue references.
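As a brief preview, and only a sketch, here is how a pair of overloads can be selected at compile time based on whether the argument is an lvalue or an rvalue. The function which is hypothetical, and it only reports which overload was chosen rather than copying or moving anything.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload pair for illustration.
// Chosen for lvalues: this is where a deep copy would happen.
const char* which(const std::string&) { return "lvalue overload"; }
// Chosen for rvalues: this is where a transfer of ownership would happen.
const char* which(std::string&&) { return "rvalue overload"; }

std::string pick_overloads() {
    std::string named = "hello";
    std::string result;
    result += which(named);                // named is an lvalue
    result += ", ";
    result += which(std::string("temp"));  // a temporary is an rvalue
    assert(result == "lvalue overload, rvalue overload");
    return result;
}
```

The double ampersand in std::string&& is the rvalue reference syntax.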
What’s Next?
Next up, I’ll cover rvalue references, the Rule of Five, universal references, and finally, move semantics!
Move semantics is an essential tool in the C++ programmer’s toolbox. It can reduce unnecessary copies and allocations in your code, which often are the largest latency contributors.
The catch:
Move semantics is tricky! It is hard to understand and even harder to get right. It’s one of those topics that if you take a shortcut while learning, you’ll quickly realize you need to go back to the drawing board.
Having learned the ropes of lvalues & rvalues gives you a solid foundation to understand move semantics the right way, once and for all!
Sources & More Info
- Effective Modern C++ by Scott Meyers, Items 23–25, 28
- “C++ Rvalue References Explained” by Thomas Becker
- “Understanding lvalues and rvalues in C and C++” by Eli Bendersky
Collaborators
A special thanks to a few friends who took time to edit and review this post:
- Janay Bengal (LinkedIn)