Value Types in C++11 | Ruminations

astrotycoon 2018-06-19


You may have heard the terms “lvalue” and “rvalue” used for various programming languages before, but I wanted to discuss them in a bit more detail, since they’re a fairly fundamental concept in compilers that spills over into the way you use the languages themselves.

The terms “lvalue” and “rvalue” come from a seminal work by Christopher Strachey called Fundamental Concepts in Programming Languages, which is actually a composition of lecture notes from 1967.

At the core, lvalues and rvalues are quite simple concepts. An L-Value is named because it can appear on the left-hand side of an assignment operation, and an R-Value is named because it can appear on the right-hand side. However, the devil is in the details (so to speak).

More properly, an L-Value is a named value that has a memory location associated with it. For instance, when you declare a local integer variable with the name ‘x’, you are creating something that can be an L-Value. An R-Value is just that: a value. It does not require a memory location to be associated with it, nor does it require a name. For instance, literals are R-Values.

Interestingly, this means that L-values can be R-values. For instance:

int x = 12;
x = x + 100;

In the first statement, x is being declared as an integer. Then it is used as an l-value, to accept the literal number 12 as an r-value. In the second statement, x is used as an l-value, to accept the result of the expression x + 100. However, that expression uses two r-values: the literal 100, and the value of x. So even though x is an l-value, it undergoes an l-value-to-r-value conversion. The result of the expression x + 100 is itself an rvalue, having a value but no firm memory address.
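One way to see the difference is to try taking an address. The sketch below reuses the two statements above inside a hypothetical helper named lvalue_and_rvalue_demo (the function name and the address variable are not from the original text):

```cpp
#include <cassert>

// Runs the two statements from the text and returns the final value of x.
int lvalue_and_rvalue_demo() {
    int x = 12;               // x is an lvalue; the literal 12 is an rvalue
    int* address = &x;        // fine: an lvalue has a memory location to point at
    // int* bad = &(x + 100); // would not compile: x + 100 is an rvalue
    x = x + 100;              // x on the right is read via lvalue-to-rvalue conversion
    assert(*address == 112);  // the store went to x's own memory location
    return x;
}
```

The commented-out line is the interesting one: the compiler rejects it because the result of x + 100 has no address to take.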

This seems rather cut and dried, but in reality, the rules are more complicated. C++11 makes them even more complex by adding what are called “r-value references.” This change mutates the notion of traditional r-values slightly, causing some definition changes. The taxonomy used for C++11 involves several value types (the standard itself calls them “value categories”): lvalue, xvalue, prvalue, rvalue, and glvalue.

In this taxonomy, lvalue remains exactly the same. It represents a named item with a concrete memory location associated with it. For instance, functions, local variables, etc. The “named item” part is slightly tricky:

int& foo();

In this example, foo is an l-value, but so is the int& *result* of calling foo, even though it’s not strictly named. But if you think about it more deeply, it really is named, just in an opaque manner: for the function to return a reference, that reference must refer to some object that outlives the call, and that object has a memory location. This is why it is legal to write foo() = 5; since the result is an l-value, you are allowed to assign into it.
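As a minimal sketch, suppose foo returns a reference to a file-level variable. The variable storage and the helper assign_through_foo are assumptions made for illustration; the text only declares foo:

```cpp
#include <cassert>

// Hypothetical object that the reference returned by foo refers to.
static int storage = 0;

int& foo() { return storage; }

// Assigns through the result of foo() and returns what ended up in storage.
int assign_through_foo() {
    foo() = 5;                   // legal: foo() is an lvalue of type int
    assert(&foo() == &storage);  // the result has the address of storage
    return storage;
}
```

The assignment lands in storage, which is the “opaque name” behind the result of the call.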

The prvalue is identical to what I described for r-values above. It stands for “pure r-value”, and is the traditional concept of r-values. For instance, the result of calling a function that returns by value (rather than by reference) is a prvalue. For example:

int foo();

foo is still an l-value, but the result of calling foo is an r-value. This is why you cannot do: foo() = 5;
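A prvalue cannot be assigned to, but it can still be bound to a reference. The sketch below gives foo a hypothetical body returning 7 and adds an illustrative helper, use_prvalue; neither appears in the original text:

```cpp
#include <cassert>

int foo() { return 7; }  // hypothetical body; the text only declares foo

int use_prvalue() {
    // foo() = 5;          // would not compile: foo() is a prvalue of type int
    const int& r = foo();  // a const lvalue reference can bind to a prvalue
    int&& rr = foo();      // so can an rvalue reference
    rr += 1;               // rr is a named reference, so it can be assigned to
    assert(r == 7 && rr == 8);
    return r + rr;
}
```

In both bindings the temporary holding 7 lives as long as the reference bound to it, which is why reading r and rr afterwards is safe.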

xvalues are “eXpiring” values that refer to an object near the end of its lifetime, such as the result of calling a function that returns an rvalue reference. For instance:

int&& foo();

foo is still an l-value, but the result of calling foo is an xvalue. The && syntax is called an “r-value reference”, and it’s a way for you to signal to the compiler that an object can be safely “stolen” using move semantics.
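The most familiar function that returns an rvalue reference is std::move from the standard library. As a rough illustration of “stealing”, here is a sketch with a hypothetical helper, steal_from_local, that moves one std::string into another:

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string steal_from_local() {
    std::string source = "a string long enough to live on the heap";
    // std::move does not move anything by itself; it only casts source to an
    // xvalue, which lets the move constructor of std::string be selected.
    std::string target = std::move(source);
    // source is still a valid object, but its contents are now unspecified.
    assert(!target.empty());
    return target;
}
```

After the move, the only safe things to do with source are to assign to it or let it be destroyed.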

xvalues are interesting in that they sit between lvalues and prvalues. Like an lvalue, an xvalue refers to an object with an identity; like a prvalue, it can be moved from. One rule trips many people up: an expression that names an rvalue reference variable is always an lvalue, while an unnamed rvalue reference, such as the result of calling foo, is an xvalue and behaves like an rvalue. So in our example, you cannot do foo() = 12;. However, if you had an r-value reference parameter, you could use it as an l-value, like this:

void foo( int&& i ) {
    i = 12;
}
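Overload resolution makes the “named means lvalue” rule visible. In this sketch, kind, pass_by_name and pass_with_move are hypothetical names introduced only for the demonstration:

```cpp
#include <cassert>
#include <utility>

// Hypothetical overload pair that reports which kind of argument it received.
char kind(int&)  { return 'L'; }  // chosen for lvalue arguments
char kind(int&&) { return 'R'; }  // chosen for rvalue arguments

char pass_by_name(int&& i) {
    i = 12;                     // allowed: the expression i is an lvalue
    assert(i == 12);
    return kind(i);             // selects kind(int&)
}

char pass_with_move(int&& i) {
    return kind(std::move(i));  // std::move(i) is an xvalue, selects kind(int&&)
}
```

Even though both parameters are declared as int&&, only the std::move version forwards the argument as an rvalue.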

These three value types (lvalue, xvalue and prvalue) are called the “fundamental” value types, and every expression belongs to exactly one of these types. The remaining two type descriptions are glvalue (a generalized lvalue) and rvalue. A glvalue is either an lvalue or an xvalue, and an rvalue is either a prvalue or xvalue. They are mostly used when requiring generalized statements about what value types are allowed.
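You can ask the compiler which of the three fundamental types an expression has: decltype((expr)), with the extra parentheses, yields T& for an lvalue, T&& for an xvalue, and plain T for a prvalue. The macros and function names below are hypothetical helpers built on that rule, not part of any library:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical declarations; they are only used inside decltype,
// so no definitions are needed.
int   by_value();
int&  by_lref();
int&& by_rref();

#define IS_LVALUE(e)  (std::is_lvalue_reference<decltype((e))>::value)
#define IS_XVALUE(e)  (std::is_rvalue_reference<decltype((e))>::value)
#define IS_PRVALUE(e) (!std::is_reference<decltype((e))>::value)
#define IS_GLVALUE(e) (IS_LVALUE(e) || IS_XVALUE(e))
#define IS_RVALUE(e)  (IS_XVALUE(e) || IS_PRVALUE(e))

bool check_categories() {
    int x = 0;
    static_assert(IS_LVALUE(x),            "a variable name is an lvalue");
    static_assert(IS_LVALUE(by_lref()),    "a call returning int& is an lvalue");
    static_assert(IS_XVALUE(by_rref()),    "a call returning int&& is an xvalue");
    static_assert(IS_XVALUE(std::move(x)), "std::move(x) is an xvalue");
    static_assert(IS_PRVALUE(by_value()),  "a call returning int is a prvalue");
    static_assert(IS_PRVALUE(x + 100),     "an arithmetic result is a prvalue");
    static_assert(IS_GLVALUE(by_rref()) && IS_RVALUE(by_rref()),
                  "an xvalue is both a glvalue and an rvalue");
    (void)x;  // x is only inspected at compile time
    return true;
}
```

Every check happens at compile time, so if this translation unit compiles, the classifications hold.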

So that’s a brief run-down of the various value types in C++11, what they mean and when they’re used. Now for some more in-depth examples. Imagine:

#include <string>
#include <utility>

class Foo {
    std::string Data;
public:
    const std::string& Get() const & { return Data; } // 1: called on lvalues
    std::string&& Get() && { return std::move( Data ); } // 2: called on rvalues
};

void bar( const std::string& b ) {} // empty body so the example links

int main() {
    Foo f;
    std::string data = Foo().Get(); // Get is called on a temporary Foo
    bar( f.Get() );
}

In this case, f is an l-value of type Foo. bar is also an l-value, because it names a function. Note the & and && after the parameter lists of Get: these ref-qualifiers are what allow the two overloads to coexist, since functions cannot be overloaded on return type alone.

When you call Foo::Get, which Get will be called in each of the statements?

The first statement will call version 2 of Get, because the Foo object it is called on is a temporary, which is an rvalue, and the &&-qualified overload is the better match for rvalues. The result of that call is an xvalue, so data is initialized by the move constructor of std::string rather than by a copy.

The second statement will call version 1 of Get, because f is an lvalue and the &&-qualified overload cannot be called on an lvalue. The result of that call is an lvalue of type const std::string, which binds directly to the reference parameter of bar without any copy. Once inside of bar, the parameter b is named, so it is treated as an lvalue again.
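To check which overload runs, here is a sketch of a tracing variant of Foo. It assumes the two overloads are told apart by ref-qualifiers (const & and &&); the LastCalled variable, the "payload" initializer, and the three helper functions are hypothetical additions for the demonstration:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical tracing variable: records which overload of Get ran last.
static int LastCalled = 0;

class Foo {
    std::string Data = "payload";
public:
    const std::string& Get() const & { LastCalled = 1; return Data; }
    std::string&& Get() && { LastCalled = 2; return std::move(Data); }
};

int overload_for_temporary() {
    std::string data = Foo().Get();         // Foo() is a prvalue
    assert(data == "payload");
    return LastCalled;
}

int overload_for_named_object() {
    Foo f;
    const std::string& ref = f.Get();       // f is an lvalue
    assert(ref == "payload");
    return LastCalled;
}

int overload_for_moved_object() {
    Foo f;
    std::string data = std::move(f).Get();  // std::move(f) is an xvalue
    assert(data == "payload");
    return LastCalled;
}
```

The third helper shows that a named object can still reach version 2, as long as the caller explicitly turns it into an rvalue with std::move.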

Hopefully this gives you a pretty good understanding of the various types of values in C++11 and how they’re used.

tl;dr: lvalues are things which have a concrete memory location for storage and can appear on the left-hand side of an assignment; (p)rvalues are things which have values and can appear on the right-hand side of an assignment.