C++ Beyond the Syllabus #9: std::string_view

The modern C++ solution for read-only access of string data.

Jared Miller
6 min read · Aug 8, 2024

This article is a part of the C++ Beyond the Syllabus series.

You’re probably familiar with both C-strings and C++’s std::string. You might be wondering why we need yet another string type. The answer is that std::string_view isn’t really another string type — it doesn’t actually own any string-like state.

A Quick Recap

C-Strings

C-strings are null-terminated arrays of characters (that is, they end with '\0') used in both C and C++. They require careful management to avoid common issues like buffer overflows.

C-strings can be allocated pretty much anywhere depending on how they are defined…

const char* str_literal = "Hello!"; // Allocated in read-only static memory
char stack_str[100]; // Allocates memory on the stack
char* heap_str = new char[100]; // Allocates memory on the heap

If you use new[] to allocate a C-string (char array) on the heap, you must remember to release it with delete[] as well.

The only way to know the size of a C-string without caching it somewhere is by traversing until you come across the null-character.
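
To make that cost concrete, here is a minimal sketch of what the traversal looks like. c_string_length is a hypothetical helper written only for illustration; in real code you would call std::strlen, which performs the same linear scan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts characters up to (not including) the '\0' terminator.
// The loop runs once per character, so the cost grows linearly with the length.
std::size_t c_string_length(const char* str)
{
    std::size_t length = 0;
    while (str[length] != '\0')
    {
        ++length;
    }
    return length;
}

// Illustrative check: the hand-written loop agrees with std::strlen.
void check_c_string_length()
{
    assert(c_string_length("Hello!") == std::strlen("Hello!"));
}
```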

std::string

The C++ standard library introduced std::string to provide safer and more convenient string usage.

std::string maintains a C-string under the hood along with a ton of helper functions to help you efficiently operate on it.

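For very short strings, most standard library implementations use small string optimization (SSO) to store the characters inside the std::string object itself and avoid heap allocation, but this is very much the exception. Longer strings live on the heap.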

Why don’t these suffice?

Well, a lot of applications work with both C-strings and std::strings. Even when an application uses a single string representation throughout, it may use libraries that expect a different string type, requiring either redundant library code or unnecessary conversions and copies of data.

Prior to std::string_view, code had three main options to be compatible with both string types:

  1. Duplicate all logic: For any code operating on std::string, write equivalent code for C-strings. This doesn’t even work all of the time: it falls apart (or becomes far more convoluted) when you need both string types to be represented in a container.
  2. Handle all std::strings as C-strings: We can efficiently grab the underlying C-string from any std::string object, but then we’re effectively disregarding all of the great features that come with std::string.
  3. Convert all C-strings to std::strings: This requires traversing the entire C-string (which can be arbitrarily long!) to find the size, then allocating the memory to store it within a new std::string, and then traversing the entire C-string again to copy its contents over to the std::string! Needless to say this option won’t suffice in any performance-sensitive application.

Introducing std::string_view

Hopefully now you’re onboard with the need for a new way to efficiently handle strings of either type.

Unfortunately for us, we can’t have it all…

std::string is the C++ mechanism for making string manipulation safe and convenient — that already exists. If we want to manipulate strings of type C-string and std::string in a uniform way, we’re still going to have to convert all C-strings to std::strings or vice versa.

What we can do, however, is create an efficient read-only wrapper around C-strings, which can be used for std::string’s underlying C-string as well.
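
As a preview of where this is heading, here is a small sketch of one read-only function serving both string types through std::string_view. The function count_spaces and the sample strings are hypothetical, chosen only to illustrate the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical example function: counts the spaces in any read-only character sequence.
std::size_t count_spaces(std::string_view text)
{
    std::size_t spaces = 0;
    for (const char c : text)
    {
        if (c == ' ')
        {
            ++spaces;
        }
    }
    return spaces;
}

// One implementation serves C-strings and std::string alike.
void check_count_spaces()
{
    const char* literal = "a b c";
    const std::string owned = "one two three four";

    assert(count_spaces(literal) == 2); // view built from a C-string
    assert(count_spaces(owned) == 3);   // view of the std::string's buffer, no copy
}
```

Neither call copies the characters; each one only builds a small view object that refers to the existing data.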

Designing std::string_view

If you were to design such a wrapper, how would you do it?

Here are a few observations/requirements to get us started. We want…

  • to make initializing the wrapper efficient.
  • instances of the wrapper to have a small memory footprint.
  • the wrapper to be capable of performing all of the same read-only operations as std::string.

All of these requirements are actually pretty simple to achieve.

The std::string_view implementation only has two member variables…

  • const char* pointing to the first character
  • size_t representing the size of the string

Together, these members require 16 bytes of memory (or 8B on a 32-bit system), which is pretty small in the grand scheme of things.
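
The layout described above can be sketched in a few lines. MiniView and make_view are hypothetical names used only for illustration; real implementations are class templates with many more member functions, but the stored state follows the same idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for illustration only, not the standard library's definition.
struct MiniView
{
    const char* data; // points at the first character; the view does not own it
    std::size_t size; // number of characters, excluding any null terminator
};

// Building a view from a pointer and a known size copies two words, nothing more.
MiniView make_view(const char* start, std::size_t size)
{
    return MiniView{start, size};
}

// Illustrative check: the view aliases the original buffer rather than copying it.
void check_mini_view()
{
    const char* text = "Hello!";
    const MiniView view = make_view(text, std::strlen(text));
    assert(view.data == text);
    assert(view.size == 6);
}
```

On a typical 64-bit platform, sizeof(MiniView) comes out to the 16 bytes mentioned above, since both members are 8 bytes wide.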

std::string_view also provides nearly all of the read-only helper functions that std::string provides, enabling convenient comparison, substring, and hashing functionality.
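
Here is a short sketch of a few of those read-only operations in use. The helper file_name and the sample path are hypothetical; rfind, substr, compare, operator== and the std::hash specialization are all part of the standard std::string_view interface.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>

// Hypothetical helper: returns the part of `path` after the last '/', or all of
// `path` if there is no '/'. The result is another view into the same characters.
std::string_view file_name(std::string_view path)
{
    const std::size_t slash = path.rfind('/');
    if (slash == std::string_view::npos)
    {
        return path;
    }
    return path.substr(slash + 1);
}

void check_read_only_operations()
{
    const std::string_view path = "/usr/include/string_view";

    // rfind and substr work as they do on std::string, but substr returns a view.
    assert(file_name(path) == "string_view");

    // compare() is available alongside the comparison operators.
    assert(file_name(path).compare("string") > 0);

    // std::hash is specialized for std::string_view, so views can key hash containers.
    assert(std::hash<std::string_view>{}(file_name(path)) ==
           std::hash<std::string_view>{}("string_view"));
}
```

Note that substr on a std::string_view returns another view rather than a new string, so slicing costs constant time and allocates nothing.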

Constructing std::string_view

There are many constructors, but there are two that I think are worthy of a shoutout:

  1. constexpr string_view( const CharT* start, size_type size );

    This constructor is trivial. You pass in a potentially-const char* and string size (not including the null character). The time complexity to construct the std::string_view is then constant because the member variables can be directly initialized with these arguments.

    You can use this constructor to build a std::string_view from a std::string, because you’ll already have a pointer to the first character (std::string::data()) and the size (std::string::size()), although std::string also converts to std::string_view directly. If you happen to be storing the size of a C-string, you can use this constructor for those too.
  2. constexpr string_view( const CharT* start );

    This is generally the constructor you’ll use when building a std::string_view from a C-string because C-strings do not cache their size. This constructor has linear time complexity as it must traverse the C-string until it finds a null character.
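
A brief sketch of the two constructors side by side may help. The helper first_n and the sample text are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Pointer-plus-size constructor: constant time, and the size is taken on trust.
std::string_view first_n(const char* start, std::size_t n)
{
    return std::string_view(start, n);
}

void check_constructors()
{
    const char* greeting = "Hello, world!";

    // Pointer-only constructor: scans for '\0' to determine the size (linear time).
    const std::string_view whole(greeting);
    assert(whole.size() == 13);

    // Pointer-plus-size constructor: no scan, so a view can cover just a prefix.
    const std::string_view hello = first_n(greeting, 5);
    assert(hello == "Hello");
    assert(hello.data() == greeting); // same characters, no copy
}
```

Because the pointer-plus-size form trusts the size it is given, passing a size larger than the underlying buffer produces a view that reads out of bounds.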

Notably, std::string_view itself has no constructor with a std::string parameter. The conversion lives on the std::string side instead: std::string provides a conversion operator to std::string_view, so a std::string can be passed wherever a std::string_view is expected, in constant time and without copying. The resulting view is still non-owning, which can be surprising when the std::string is a temporary. We’ll touch on other gotchas regarding ownership in the next section.
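
For std::string specifically, it is worth knowing that std::string provides a conversion operator to std::string_view, so a view can be obtained without spelling out data() and size(). The sketch below shows both routes; is_greeting is a hypothetical function used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical function that accepts any read-only string data.
bool is_greeting(std::string_view text)
{
    return text == "Hello!";
}

void check_string_conversion()
{
    const std::string owned = "Hello!";

    // Implicit: std::string's conversion operator produces the view.
    assert(is_greeting(owned));

    // Two ways to name the view explicitly; both refer to the string's own buffer.
    const std::string_view from_conversion = owned;
    const std::string_view from_pointer_and_size(owned.data(), owned.size());
    assert(from_conversion == from_pointer_and_size);
    assert(from_conversion.data() == owned.data());
}
```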

What’s the catch?

This isn’t the end-all-be-all future of string manipulation in C++. There are actually quite a few drawbacks and reasons not to use std::string_view.

They Are Non-Owning

If you use them right, this is a good thing. The fact that std::string_view does exactly what its name suggests (i.e., views strings), as opposed to owning them, makes it an ultra-lightweight way to view both C-strings and std::strings.

This same property can cause some hiccups, though. What do you suspect the following example will output?

#include <iostream>
#include <string>
#include <string_view>

std::string_view generate_string()
{
    const std::string temp = "Temporary String";
    std::string_view temp_view_1(temp.data(), temp.size());
    std::cout << "temp_view_1: " << temp_view_1 << "\n";
    return temp_view_1; // temp is destroyed on return, so this view dangles
}

int main()
{
    std::string_view temp_view_2 = generate_string();
    std::cout << "temp_view_2: " << temp_view_2;
}

Let’s walk through it.

The std::string_view temp_view_1 is created on top of the local string temp. When generate_string returns temp_view_1, temp is destroyed, so the returned view points at memory that is no longer valid!

If we run this example in Compiler Explorer, we unsurprisingly see that temp_view_2 is observing garbage. Strictly speaking, reading through it is undefined behavior, so the exact output can vary:

temp_view_1: Temporary String
temp_view_2: C��Oaq�

As the example above illustrates, std::string_view is similar to a reference or pointer in the sense that you need to make sure the underlying data outlives the view. If it does not, any access through the std::string_view is undefined behavior.

Likewise, std::string can grow in size (similar to std::vector). If the size of a std::string outgrows its capacity, a new, larger section of memory will be allocated and all of the underlying data will be copied to the new location. This invalidates any std::string_view that pointed into the std::string before it grew, and any later access through such a view is undefined behavior.
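
The sketch below shows the buffer moving without ever reading through a stale view. buffer_moves_on_growth is a hypothetical helper; it compares addresses as integers so the old buffer is never dereferenced after the growth.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper: grows `text` past its current capacity and reports
// whether the character buffer ended up at a different address.
bool buffer_moves_on_growth(std::string& text)
{
    const auto before = reinterpret_cast<std::uintptr_t>(text.data());
    text.append(text.capacity() + 1, 'x'); // always exceeds the old capacity
    const auto after = reinterpret_cast<std::uintptr_t>(text.data());
    return before != after;
}

void check_refresh_after_growth()
{
    std::string text = "short";
    std::string_view view = text; // valid only while text's buffer stays put

    const bool moved = buffer_moves_on_growth(text);
    assert(moved);

    // `view` must not be read at this point; rebuild it from the string instead.
    view = text;
    assert(view.size() == text.size());
    assert(view.substr(0, 5) == "short");
}
```

The safe habit is to treat any operation that may reallocate the string as the end of life for every view taken before it, and to rebuild those views afterwards.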

The fix is simple:

Your code must ensure that a std::string_view never outlives the underlying data it was constructed from.
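
For the generate_string example, one possible fix (a sketch, not the only option) is to return the owning std::string by value and let the caller take a view only once it holds the owner.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Returning the owning std::string by value hands the characters to the caller,
// so nothing is left pointing into a destroyed local.
std::string generate_string()
{
    std::string temp = "Temporary String";
    return temp;
}

void check_generate_string()
{
    // The caller keeps the owner alive...
    const std::string owner = generate_string();
    // ...and only then takes a view of it.
    const std::string_view view = owner;
    assert(view == "Temporary String");
}
```

The view is now created in the same scope as the string it refers to, so the two share a lifetime and the view cannot outlive its data.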

When Not To Use

There are a few clear scenarios outlining when not to use std::string_view:

  1. If your application only works with std::strings, const std::string& is typically more space-efficient than std::string_view, as a reference is usually implemented as a single pointer, half the size of a std::string_view.
  2. When the lifetime of the underlying data is unknown. For example, you may not want to use std::string_view to represent string-like data returned from a function or when processing string data across multiple threads.

Wrapping Up

Large C++ applications often include various libraries, input sources, and other logic, which employ both C-string and std::string representations of string-like data. std::string_view is the go-to modern C++ solution for read-only access of string data in these applications.
