[C++] In-depth article: the differences between rvalue references and lvalue references in C++11, and their practical value

  • 1. lvalue references and rvalue references
  • 2. Comparison of lvalue references and rvalue references
  • 3. Application scenarios and value
    • Ⅰ. Scenario 1: Function passes value and returns copy
      • ①.Move assignment
      • ②.Move copy
    • Ⅱ. Scenario 2: Container insertion interface
    • Ⅲ. Scenario 3: Perfect forwarding

1. lvalue reference and rvalue reference

Before introducing rvalue references and lvalue references, we need to understand what is an lvalue and what is an rvalue.

1. What is an lvalue? What is an lvalue reference?

1. An lvalue is an expression that refers to data with identifiable storage, such as a variable or a dereferenced pointer.
2. The address of an lvalue can be taken.
3. An lvalue can generally be modified (unless it is const).
4. A modifiable lvalue can appear on the left side of an assignment, whereas an rvalue cannot.
[Question ①] Is a variable modified with const an lvalue?
Answer: Yes! The content of a const variable cannot be modified, but its address can still be taken, so it is an lvalue.
[Question ②] What is an lvalue reference?
Answer: An lvalue reference is an alias for an lvalue.

int main()
{
    // p, b, c and *p below are all lvalues
    int* p = new int(0);
    int b = 1;
    const int c = 2;

    // because the addresses of p, b, c and *p can all be taken

    // The following are lvalue references to the lvalues above
    int*& rp = p;
    int& rb = b;
    const int& rc = c;
    int& pvalue = *p;

    delete p; // release the int allocated above
    return 0;
}

2. What is an rvalue? What is an rvalue reference?

1. An rvalue is also an expression that represents data, such as a literal constant, the result of an expression, or a function return value (returned by value).
2. The address of an rvalue cannot be taken.
3. An rvalue cannot be modified.
4. An rvalue can appear on the right side of an assignment, but not on the left side.
[Question ①] What is an rvalue reference?
Answer: An rvalue reference is an alias for an rvalue. It is written with one more & than an lvalue reference.

int fmin(int x, int y)
{
    return x < y ? x : y;
}
int main()
{
    int x = 1, y = 2;
    // The following are common rvalues
    10;
    // a literal constant: its address cannot be taken
    x + y;
    // the result of an expression: its address cannot be taken
    fmin(x, y);
    // a function return value (returned by value): its address cannot be taken

    // The following are all rvalue references to rvalues
    int&& rr1 = 10;

    int&& rr2 = x + y;

    int&& rr3 = fmin(x, y);
    return 0;
}

2. Comparison of lvalue references and rvalue references

Before rvalue references were introduced, did code contain no rvalues? Of course it did. So how were rvalues passed around by reference back then?
An ordinary lvalue reference cannot refer to an rvalue, but a const lvalue reference can.

int main()
{
    // An lvalue reference can only refer to an lvalue, not an rvalue.
    int a = 10;
    int& ra1 = a; // ra1 is an alias of a ----> lvalue reference
    // int& ra2 = 10; // Compilation fails because 10 is an rvalue

    // A const lvalue reference can refer to both an lvalue and an rvalue.
    const int& ra3 = 10;
    // With const added, the lvalue reference can refer to an rvalue.
    const int& ra4 = a;
    return 0;
}

An rvalue reference can only refer to an rvalue, not an lvalue. However, the standard library provides the function std::move, which casts an lvalue to an rvalue: the expression std::move(a) is an rvalue, while a itself remains an lvalue.

#include <utility> // std::move

int main()
{
    // An rvalue reference can only refer to an rvalue, not an lvalue.
    int&& r1 = 10;

    int a = 10;
    // int&& r2 = a; // This is wrong
    // error C2440: 'initialization': cannot convert from 'int' to 'int &&'
    // message: Cannot bind lvalue to rvalue reference

    // But an rvalue reference can refer to an lvalue after std::move
    int&& r2 = std::move(a);
    // std::move(a) is an rvalue expression; a itself is still an lvalue.
    return 0;
}

Summary:
1. An lvalue reference can only refer to an lvalue, but a const-modified lvalue reference can refer to either an lvalue or an rvalue.
2. An rvalue reference can only refer to an rvalue, but it can refer to an lvalue that has been passed through std::move.
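The summary can be checked with a small overload set. This is only a sketch: the function names bound_to, only_const_ref and demo_binding are made up for illustration, and each overload simply reports which kind of reference the argument was bound to.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helpers: report which reference type accepted the argument.
inline std::string bound_to(int&)       { return "lvalue reference"; }
inline std::string bound_to(const int&) { return "const lvalue reference"; }
inline std::string bound_to(int&&)      { return "rvalue reference"; }

// With only a const lvalue reference available, lvalues and rvalues both bind to it.
inline std::string only_const_ref(const int&) { return "const lvalue reference"; }

inline void demo_binding() {
    int a = 10;
    const int c = 20;
    assert(bound_to(a) == "lvalue reference");             // plain lvalue
    assert(bound_to(c) == "const lvalue reference");       // const lvalue
    assert(bound_to(10) == "rvalue reference");            // literal is an rvalue
    assert(bound_to(std::move(a)) == "rvalue reference");  // lvalue after std::move
    assert(only_const_ref(a) == "const lvalue reference");
    assert(only_const_ref(10) == "const lvalue reference");
}
```

Calling demo_binding() from main runs all the checks; none of them should fire.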

Seeing this, you may feel that rvalue references are of little use: a const lvalue reference can already refer to an rvalue, so why not just write const references everywhere?
Judging only by the above, rvalue references do look pointless, but in certain scenarios they are extremely useful. This cannot be explained in a few words, so next I will show the charm of rvalue references in three scenarios!

3. Application scenarios and value

Both lvalue references and rvalue references exist to improve efficiency!
1. Where are lvalue references used?

1. As function parameters
2. As function return values
3. In both cases, to reduce copies

The core value of lvalue references is to reduce copies and improve efficiency. But the scenarios are limited:
① As a function parameter, an lvalue reference works very well and avoids a copy.
② As a function return value, an lvalue reference can only be used when the referenced variable still exists after the function ends, for example a static or global variable. Only in that case can the return-by-reference reduce a copy.
③ So when the variable is a local variable, you cannot return an lvalue reference to it; you must return by value!

1. Why is a copy made when returning by value?
Because the local variable is destroyed when the function ends, a temporary object has to be copy-constructed to hold the return value. That is one copy.
2. When the return value is a built-in type, the copy is cheap. When it is a custom type that owns resources, the copy is expensive, because it is a deep copy and new space has to be allocated.
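To make the cost visible, here is a minimal sketch that counts copies. The type Big and the functions by_value, by_ref and shared_instance are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type that counts how often it is copied.
struct Big {
    static int copies;
    std::vector<int> data;
    Big() : data(1000, 1) {}
    Big(const Big& other) : data(other.data) { ++copies; }
};
int Big::copies = 0;

std::size_t by_value(Big b)      { return b.data.size(); } // the parameter is a copy
std::size_t by_ref(const Big& b) { return b.data.size(); } // the parameter is an alias

// Returning a reference is safe here only because the object is static
// and therefore still exists after the function returns.
Big& shared_instance() {
    static Big instance;
    return instance;
}

inline void demo_lvalue_refs() {
    Big b;
    Big::copies = 0;
    by_value(b);
    assert(Big::copies == 1); // passing by value copied the object
    by_ref(b);
    assert(Big::copies == 1); // passing by reference did not
    Big& s = shared_instance();
    assert(&s == &shared_instance()); // same object every time, no copy
    assert(Big::copies == 1);
}
```

A local variable cannot be returned like shared_instance does, which is exactly the gap that return by value (and later, move semantics) has to cover.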

2. Where are rvalue references used?
The core value of rvalue references is also to reduce copies, covering the scenarios that lvalue references cannot handle, such as the return-by-value copy described above. How do rvalue references solve it? In short: by transferring resources directly instead of copying them.

Ⅰ. Scenario 1: Function passes value and returns copy

This scenario concerns custom types that require deep copying and that must be returned by value.

[Question] Why focus on custom types that require deep copying?
① For built-in types the cost of copying is very low, since even the largest of them is only about the size of a double. So the main concern is custom types.
② If a custom type does not need deep copying, copying it is also cheap and needs no special attention. A custom type that does need a deep copy is expensive: a block as large as the object's data has to be allocated, and later freed.
③ The key point is that we do not know what the custom type is. A tree? A linked list? Something else? The cost could be frightening. And lvalue references cannot help here, because the function returns by value.


Here I use a custom string type to demonstrate. It is a string class I wrote by hand earlier; I copied it here and deleted the unimportant parts.

#pragma once
#define _CRT_SECURE_NO_WARNINGS 1
#include <assert.h>
#include <string.h>
#include <iostream>

namespace tao
{

class string
{
public:
    // Iterators are plain pointers into the character buffer
    typedef char* iterator;
    typedef const char* const_iterator;

    // Constructor
    string(const char* str = "")
    {
        std::cout << "string(const char* str) -- construct" << std::endl;
        _size = strlen(str);
        _capacity = _size;
        _str = new char[_capacity + 1];
        memcpy(_str, str, _size + 1);
    }
    // Copy constructor
    string(const string& s) // deep copy
    {
        std::cout << "string(const string& s) -- deep copy" << std::endl;
        _str = new char[s._capacity + 1];
        memcpy(_str, s._str, s._size + 1);
        _size = s._size;
        _capacity = s._capacity;
    }
    // Copy assignment operator
    string& operator=(const string& s)
    {
        std::cout << "string& operator=(const string& s) -- deep copy" << std::endl;
        if (this != &s) // guard against self-assignment
        {
            char* tmp = new char[s._capacity + 1];
            memcpy(tmp, s._str, s._size + 1); // copy the '\0' as well
            delete[] _str;
            _str = tmp;
            _size = s._size;
            _capacity = s._capacity;
        }
        return *this;
    }
    ~string()
    {
        delete[] _str;
        _str = nullptr;
        _size = _capacity = 0;
    }
    size_t size() const // read-only, does not modify the object
    {
        return _size;
    }
    char& operator[](size_t pos) // returning by reference is fine: the character outlives the call
    {
        assert(pos < _size);
        return _str[pos];
    }
    // Two overloads: the one above, and this one for const objects, which is read-only.
    const char& operator[](size_t pos) const
    {
        assert(pos < _size);
        return _str[pos];
    }
    iterator begin() // returns an iterator to the first character
    {
        return _str;
    }
    iterator end() // returns an iterator one past the last character
    {
        return _str + _size;
    }
    const_iterator begin() const
    {
        return _str;
    }
    const_iterator end() const
    {
        return _str + _size;
    }

    void reserve(size_t n)
    {
        if (n > _capacity)
        {
            char* temp = new char[n + 1];
            memcpy(temp, _str, _size + 1);
            delete[] _str;
            _str = temp;
            _capacity = n;
        }
    }
    // Add
    void push_back(char ch) // check whether more capacity is needed first; reserve does the growing
    {
        if (_size >= _capacity)
        {
            // Doubling is the usual policy, but an empty string (capacity 0) needs a starting value
            reserve(_capacity == 0 ? 4 : 2 * _capacity);
        }
        _str[_size++] = ch;
        _str[_size] = '\0';
    }

    void insert(size_t pos, size_t n, char ch)
    {
        // Step 1: check that pos is valid
        assert(pos <= _size);
        // Step 2: check whether more capacity is needed; if so, use reserve
        if (_size + n > _capacity)
        {
            reserve(_size + n);
        }
        // Step 3: shift the tail (including '\0') n positions to the right
        size_t end = _size;
        while (end >= pos && end != npos) // end != npos stops the unsigned wrap-around when pos == 0
        {
            _str[end + n] = _str[end];
            end--;
        }
        // Step 4: fill in the new characters
        for (size_t i = 0; i < n; i++)
        {
            _str[pos + i] = ch;
        }
        _size += n;
    }

    void insert(size_t pos, const char* str)
    {
        // Step 1: check that pos is valid
        assert(pos <= _size);
        // Step 2: check whether more capacity is needed; if so, use reserve
        size_t len = strlen(str);
        if (_size + len > _capacity)
        {
            reserve(_size + len);
        }
        // Step 3: shift the tail (including '\0') len positions to the right
        size_t end = _size;
        while (end >= pos && end != npos)
        {
            _str[end + len] = _str[end];
            end--;
        }
        // Step 4: copy in the new characters
        for (size_t i = 0; i < len; i++)
        {
            _str[pos + i] = str[i];
        }
        _size += len;
    }

    // Delete
    void erase(size_t pos, size_t len = npos)
    {
        assert(pos <= _size);
        if (len == npos || pos + len > _size) // erase everything from pos to the end
        {
            _size = pos;
            _str[_size] = '\0';
        }
        else
        {
            size_t end = pos + len;
            while (end <= _size) // shift the tail (including '\0') to the left
            {
                _str[pos++] = _str[end++];
            }
            _size -= len;
        }
    }
    void clear()
    {
        _str[0] = '\0';
        _size = 0;
    }
//Check/Change
    // Find / modify
    size_t find(char ch, size_t pos = 0) const
    {
        assert(pos <= _size);
        for (size_t i = pos; i < _size; i++)
        {
            if (_str[i] == ch)
                return i;
        }
        return npos;
    }

    void resize(size_t n, char ch = '\0')
    {
        if (n < _size)
        {
            _size = n;
            _str[_size] = '\0'; // keep the string terminated after shrinking
        }
        else
        {
            reserve(n); // reserve only grows the buffer when n > _capacity
            for (size_t i = _size; i < n; i++)
            {
                _str[i] = ch;
            }
            _size = n;
            _str[_size] = '\0';
        }
    }
private:
    char* _str;
    size_t _size;
    size_t _capacity;
public:
    static size_t npos;
};

size_t string::npos = -1; // the largest size_t value, meaning "not found" or "until the end"

}

①.Move assignment

Here we take a closer look at rvalues. Some rvalues are called dying values (xvalues, short for "expiring values"). Why dying? The lifetime of such an rvalue is typically a single line: once that line has finished executing, the object is destroyed. The temporary object returned by value from a function is a typical example. Strictly speaking, the standard splits rvalues into prvalues (pure rvalues, such as literals and by-value return values) and xvalues (such as the result of std::move); this article loosely calls any rvalue of a custom type that is about to be destroyed a dying value.
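As a rough way to check the terminology, decltype((expr)) can be used: it yields T& for an lvalue, T&& for an xvalue, and plain T for a prvalue. The helper names category, make and demo_categories below are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: maps the type produced by decltype((expr)) to a category name.
template <typename T>
constexpr const char* category() {
    return std::is_lvalue_reference<T>::value ? "lvalue"
         : std::is_rvalue_reference<T>::value ? "xvalue"
         : "prvalue";
}

inline std::string make() { return "temp"; }

inline void demo_categories() {
    std::string s = "named";
    // A named variable is an lvalue.
    assert(std::string(category<decltype((s))>()) == "lvalue");
    // A by-value function result is a prvalue.
    assert(std::string(category<decltype((make()))>()) == "prvalue");
    // std::move(s) is an xvalue.
    assert(std::string(category<decltype((std::move(s)))>()) == "xvalue");
    // A literal is a prvalue.
    assert(std::string(category<decltype((10))>()) == "prvalue");
}
```

Both prvalues and xvalues are rvalues, so either one can bind to an rvalue reference; that shared property is what the rest of the article relies on.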

With that in mind, consider a function func that returns a string by value, and a caller that assigns the result to an existing object: ret = func(). What happens?

Without move semantics this performs two deep copies: first the copy constructor creates a temporary object to hold the return value, then the copy assignment operator copies that temporary into ret. Two deep copies are very expensive, but before C++11 there was no way around it.
The experts then noticed a detail: the address of func's return value cannot be taken, so it is an rvalue, and because it is a custom type it is a dying value whose lifetime ends on that very line. They took advantage of the fact that a dying value is about to be destroyed and used the "Star Absorbing Technique" on it: take all of its resources and hand it whatever the target no longer needs. In the end no new space is allocated and no deep copy is made, yet ret obtains the resources it wanted.
Move assignment was written on the basis of this idea.

When the source of an assignment is an rvalue, move assignment is called. When the source is an lvalue, the ordinary copy assignment operator is called.
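A minimal sketch of this idea follows. Buffer is a hypothetical, cut-down stand-in for the tao::string class (it is not the author's full implementation), and the counters exist only to show which operator ran.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical minimal string-like class with copy and move assignment.
class Buffer {
public:
    static int deep_copies;
    static int moves;

    explicit Buffer(const char* s = "")
        : _size(std::strlen(s)), _str(new char[_size + 1]) {
        std::memcpy(_str, s, _size + 1);
    }
    Buffer(const Buffer& other)
        : _size(other._size), _str(new char[other._size + 1]) {
        std::memcpy(_str, other._str, _size + 1);
        ++deep_copies;
    }
    // Copy assignment: the source is an lvalue that lives on, so deep copy.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            char* tmp = new char[other._size + 1];
            std::memcpy(tmp, other._str, other._size + 1);
            delete[] _str;
            _str = tmp;
            _size = other._size;
            ++deep_copies;
        }
        return *this;
    }
    // Move assignment: swap resources with the dying value. It leaves with
    // our old buffer and frees it in its own destructor.
    Buffer& operator=(Buffer&& other) noexcept {
        std::swap(_str, other._str);
        std::swap(_size, other._size);
        ++moves;
        return *this;
    }
    ~Buffer() { delete[] _str; }
    const char* c_str() const { return _str; }

private:
    std::size_t _size; // declared before _str so it is initialized first
    char* _str;
};
int Buffer::deep_copies = 0;
int Buffer::moves = 0;

inline Buffer func() { Buffer str("hello"); return str; }

inline void demo_move_assignment() {
    Buffer::deep_copies = 0;
    Buffer::moves = 0;
    Buffer ret;
    ret = func();                 // source is an rvalue: move assignment
    assert(Buffer::moves == 1);
    assert(std::strcmp(ret.c_str(), "hello") == 0);

    Buffer other("world");
    int before = Buffer::deep_copies;
    ret = other;                  // source is an lvalue: deep copy
    assert(Buffer::deep_copies == before + 1);
    assert(Buffer::moves == 1);
}
```

Swapping instead of deleting is one possible design: the old buffer of ret ends up in the temporary and is released when the temporary dies at the end of the statement.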

Some people may think: are rvalue references that good? Then let's use rvalue references everywhere!
That does not work. Rvalue references exist precisely to distinguish lvalues from rvalues. When the source of a copy or assignment is an lvalue, it is not about to be destroyed and its lifetime continues. Would it be reasonable to take away all of its resources? Of course not. Rvalue references target dying values; for lvalues we still have to deep copy.

So what happens once move assignment has been written?

Results:
1. With move assignment, the number of copy/assignment calls is not reduced, but one deep copy and its memory allocation are avoided!
2. When the source is an rvalue, move assignment is called; when it is an lvalue, the ordinary copy assignment is called. For rvalues the efficiency improves greatly!

②.Move copy

Besides overloading the assignment operator with move assignment, we can also overload the copy constructor with a move copy (the move constructor); the two overloads coexist without any problem. When the source object is an lvalue, the copy constructor is called; when it is an rvalue, the move copy is called.
Next, let's analyze a similar scenario, in which the return value of func is used to construct a new object ret.

We need to know that the compiler optimizes consecutive constructions into a single construction. The analysis so far assumed no optimization. What happens when the compiler does optimize?
For this kind of scenario: consecutive constructions + a function returning by value
The compiler will optimize as follows:

1. Consecutive constructions/copy constructions are merged into one.
2. The compiler treats str as an rvalue, that is, as a dying value.
[Question ①] How are the two merged into one?
Before the function ends, str is used directly as the source object from which ret is constructed, instead of first creating a temporary and then constructing ret after the function has ended. Because str is destroyed when the function ends, this has to happen before the function returns, and str is treated as a dying value. This step is done by the compiler; we cannot see it.
[Question ②] Why is str treated as a dying value?
Because it fits the concept. Without optimization, the return value of func is a dying value. With optimization, what func returns is str itself, so str takes the place of that dying value. Treating str as a dying value is harmless, since str is about to be destroyed anyway.

In this case the whole process calls only the move copy. What used to be a deep copy through the copy constructor is now a single move copy. Awesome, isn't it?

Here the move copy transfers the resources of str directly to ret. No new space is allocated in between; the space owned by str is handed straight to ret.

What's more, the compiler also optimizes scenarios other than consecutive construction/copy construction + a function returning by value.

So since C++11, the standard containers have all gained move semantics such as move copy and move assignment.
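The move copy can be sketched as follows. Str is a hypothetical minimal class used only for this illustration. How many moves happen when returning from make_str depends on how much the compiler elides, so the sketch only checks that no deep copy takes place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical minimal string-like class with a copy constructor and a move copy.
class Str {
public:
    static int deep_copies;
    static int moves;

    explicit Str(const char* s = "")
        : _size(std::strlen(s)), _str(new char[_size + 1]) {
        std::memcpy(_str, s, _size + 1);
    }
    // Copy constructor: the source is an lvalue, allocate and deep copy.
    Str(const Str& other)
        : _size(other._size), _str(new char[other._size + 1]) {
        std::memcpy(_str, other._str, _size + 1);
        ++deep_copies;
    }
    // Move copy: take over the buffer of the dying value, allocate nothing.
    Str(Str&& other) noexcept : _size(other._size), _str(other._str) {
        other._str = nullptr; // the source no longer owns the buffer
        other._size = 0;
        ++moves;
    }
    ~Str() { delete[] _str; } // delete[] on nullptr is a no-op
    std::size_t size() const { return _size; }
    const char* c_str() const { return _str; }

private:
    std::size_t _size;
    char* _str;
};
int Str::deep_copies = 0;
int Str::moves = 0;

inline Str make_str() { Str str("hello"); return str; }

inline void demo_move_copy() {
    Str::deep_copies = 0;
    Str::moves = 0;
    Str ret = make_str();          // elided or moved, never deep copied
    assert(Str::deep_copies == 0);
    assert(std::strcmp(ret.c_str(), "hello") == 0);

    Str a("abc");
    Str b(std::move(a));           // rvalue source: move copy
    assert(Str::deep_copies == 0);
    assert(a.size() == 0);
    assert(std::strcmp(b.c_str(), "abc") == 0);

    Str c(b);                      // lvalue source: deep copy
    assert(Str::deep_copies == 1);
}
```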

Ⅱ. Scenario 2: Container insertion interface


Do you know what happens when a string object is inserted into a container?

Maybe you are confused now: where does a deep copy come from?
The container has to store its own copy of the element, so inserting a string class object causes a deep copy. That is expensive! And it applies not only to string objects but to objects of any custom type that requires deep copying.
This is where rvalue references come into play again!
What happens if the inserted object is an rvalue? In the end the move copy is called!

Before rvalue references existed, inserting an lvalue called the copy constructor, and inserting an rvalue (received by a const lvalue reference) still called the copy constructor.
With rvalue references, insertion becomes much more attractive.


So since C++11, the insertion interfaces of the standard containers have all gained rvalue reference overloads.
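The behaviour can be observed with std::list and a type that counts its own copies and moves. Tracked is a hypothetical type made up for this sketch; std::list::push_back has both a const T& and a T&& overload.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical element type that records how it was put into the container.
struct Tracked {
    static int copies;
    static int moves;
    std::string text;
    Tracked(const char* s) : text(s) {}  // allows implicit conversion from a literal
    Tracked(const Tracked& o) : text(o.text) { ++copies; }
    Tracked(Tracked&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

inline void demo_list_insert() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    std::list<Tracked> lt;

    Tracked s("first");
    lt.push_back(s);             // lvalue: push_back(const T&), copy constructor
    assert(Tracked::copies == 1 && Tracked::moves == 0);

    lt.push_back(std::move(s));  // rvalue: push_back(T&&), move copy
    assert(Tracked::copies == 1 && Tracked::moves == 1);

    lt.push_back("third");       // temporary from implicit conversion: move copy
    assert(Tracked::copies == 1 && Tracked::moves == 2);
    assert(lt.size() == 3);
}
```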

Ⅲ. Scenario 3: Perfect forwarding

#include <iostream>
#include <utility>

int main()
{
    int a = 10;
    int& r = a;
    int&& rr = std::move(a);

    int&& rrr = 10;

    std::cout << &r << std::endl;
    std::cout << &rr << std::endl;
    std::cout << &rrr << std::endl;
    return 0;
}


At the beginning we introduced two characteristics of rvalues: 1. their address cannot be taken; 2. they cannot be modified.
So can the addresses of r, rr and rrr above be printed?
rr and rrr are rvalue references and r is an lvalue reference. The address of r can certainly be taken. What about rr and rrr?
According to the properties of rvalues this should not be possible, but the result is:

All of them can be taken. Why? Didn't we say that the address of an rvalue cannot be taken and that an rvalue cannot be modified?
We can understand it this way:
Rvalues cannot be modified. We cannot modify the literal constant 10, for example.
But an rvalue reference can be modified! You can imagine that the rvalue reference opens up a space and stores the rvalue's data in it. We never modify the rvalue directly; what we modify is the rvalue reference.

Note that the address of an rvalue cannot be taken, but giving the rvalue an alias causes it to be stored at a specific location, and the address of that location can be taken. For example, the address of the literal 10 cannot be taken, but after rr1 refers to it, the address of rr1 can be taken and rr1 can be modified. If you do not want rr1 to be modified, declare it as const int&& rr1. Doesn't that feel amazing?

So here is the point: an rvalue reference variable itself has the attribute of an lvalue. Having referenced someone else's value, I treat it as my own and modify that, rather than modifying the original rvalue directly.
An rvalue cannot be passed to an lvalue reference (only to a const lvalue reference), but an rvalue reference variable can be passed to an lvalue reference!
(Because the attribute of an rvalue reference variable is an lvalue.)
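A short sketch makes this concrete. The names which, modify_through_ref and demo_named_rvalue_ref are hypothetical; the two which overloads report whether the argument was seen as an lvalue or an rvalue.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overloads that report the value category of the argument.
inline std::string which(int&)  { return "lvalue"; }
inline std::string which(int&&) { return "rvalue"; }

inline int modify_through_ref() {
    int&& rr = 10;   // rr refers to a temporary holding 10
    int& alias = rr; // allowed: the expression rr is an lvalue
    alias += 5;      // modifies the same storage
    rr += 1;
    return rr;       // 16
}

inline void demo_named_rvalue_ref() {
    int&& rr = 10;
    int* address = &rr;                       // the address can be taken
    assert(address != nullptr);
    assert(which(rr) == "lvalue");            // the named variable is an lvalue
    assert(which(std::move(rr)) == "rvalue"); // std::move restores the rvalue attribute
    assert(which(10) == "rvalue");
    assert(modify_through_ref() == 16);
}
```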

Nature:
The compiler treats an rvalue reference variable as an lvalue. Why? Because it is, in essence, still a reference with a name!
If it could not be modified, what would the reference be for? Move construction could not transfer resources, because transferring resources means modifying the source object through the rvalue reference. So an rvalue reference must be modifiable. This is specified by the language itself.

So the default attribute of an rvalue reference variable is an lvalue.

With that problem solved, another one appears:
What if we need the rvalue reference variable to keep its rvalue attribute? Is there a way? (Why such a need arises is explained shortly.)
For this the experts provided perfect forwarding, implemented by the function template std::forward: it lets a variable declared as an rvalue reference keep its rvalue attribute when it is passed on.

What is perfect forwarding?
1. When t is an lvalue reference, its lvalue attribute is kept.
2. When t is an rvalue reference, its rvalue attribute is kept.

It works together with something called a universal reference (also known as a forwarding reference): a template parameter of the form T&& that can receive both lvalues and rvalues.
When the argument is an lvalue, it becomes an lvalue reference (reference collapsing folds & && into &).
When the argument is an rvalue, it becomes an rvalue reference.

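// Fun is assumed to be overloaded elsewhere for lvalue and rvalue references
template <typename T>
void PerfectForward(T&& t)
{
    Fun(std::forward<T>(t)); // std::forward is declared in <utility>
}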
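To see the difference std::forward makes, here is a self-contained variant of the template above. As an adaptation for this sketch, Fun and PerfectForward return a string describing the selected overload instead of returning void, and NoForward is a hypothetical counterpart that passes t on without forwarding.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Assumed overload set for Fun: each overload names the reference it received.
inline std::string Fun(int&)        { return "lvalue reference"; }
inline std::string Fun(const int&)  { return "const lvalue reference"; }
inline std::string Fun(int&&)       { return "rvalue reference"; }
inline std::string Fun(const int&&) { return "const rvalue reference"; }

// T&& here is a universal (forwarding) reference.
template <typename T>
std::string PerfectForward(T&& t) {
    return Fun(std::forward<T>(t)); // keeps the original value category
}

// Without std::forward, t is a named variable and always an lvalue.
template <typename T>
std::string NoForward(T&& t) {
    return Fun(t);
}

inline void demo_perfect_forward() {
    int a = 1;
    const int b = 2;
    assert(PerfectForward(a) == "lvalue reference");
    assert(PerfectForward(b) == "const lvalue reference");
    assert(PerfectForward(10) == "rvalue reference");
    assert(PerfectForward(std::move(a)) == "rvalue reference");
    assert(PerfectForward(std::move(b)) == "const rvalue reference");
    assert(NoForward(10) == "lvalue reference"); // the rvalue attribute was lost
}
```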

So why would anyone want an rvalue reference to keep its rvalue attribute?
Let's take inserting string objects into a linked list as an example.

// Assumes <list>, <iostream>, <utility> and the tao::string class (with its move copy) are available
int main()
{
    std::list<tao::string> lt;

    tao::string s("Xiao Tao is here");
    lt.push_back(s); // insert an lvalue: calls the copy constructor --> deep copy
    std::cout << std::endl;

    tao::string s2("Xiao Tao is gone");
    lt.push_back(std::move(s2)); // insert the rvalue produced by std::move: calls the move copy and transfers the resources
    std::cout << std::endl;

    lt.push_back("Xiao Tao is the most handsome"); // implicit conversion creates a temporary, which is a dying value, so the move copy is called
    return 0;
}

The list already has a move-semantics overload push_back(string&& str). When an rvalue is passed, this overload is called; when an lvalue is passed, the ordinary overload is called. However, inside a hand-written push_back, the parameter str is a named rvalue reference, and its default attribute is an lvalue. If str is simply passed on to the next layer, the copy constructor is selected and we still end up with a deep copy. The rvalue reference has to keep its rvalue attribute all the way down before the move copy can be called!


As for how to make an rvalue reference variable keep its rvalue attribute, we have already seen the answer: use std::forward. Perfect forwarding has to be applied at every layer through which the rvalue reference is passed; otherwise the rvalue attribute is lost as soon as the argument reaches the next layer.
For the same reason, the standard containers also use perfect forwarding internally.
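The layering problem can be sketched with a toy container. OneSlotList and Item are hypothetical types invented for this illustration (the container keeps only the most recently inserted element), and push_back_lossy shows what happens when one layer forgets to forward. Inside a class template, T&& in push_back(T&& x) is an ordinary rvalue reference, so std::forward<T>(x) acts like std::move(x) there.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical element type that counts copies and moves.
struct Item {
    static int copies;
    static int moves;
    std::string text;
    explicit Item(const char* s) : text(s) {}
    Item(const Item& o) : text(o.text) { ++copies; }
    Item(Item&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Item::copies = 0;
int Item::moves = 0;

// Hypothetical stand-in for a hand-written list: push_back -> insert -> Node.
template <typename T>
class OneSlotList {
public:
    OneSlotList() : _node(nullptr) {}
    ~OneSlotList() { delete _node; }
    OneSlotList(const OneSlotList&) = delete;
    OneSlotList& operator=(const OneSlotList&) = delete;

    void push_back(const T& x) { insert(x); }
    void push_back(T&& x)      { insert(std::forward<T>(x)); } // keeps the rvalue attribute
    void push_back_lossy(T&& x) { insert(x); }                 // x is an lvalue here: copy

private:
    struct Node {
        T data;
        explicit Node(const T& x) : data(x) {}
        explicit Node(T&& x) : data(std::forward<T>(x)) {}     // last layer: move copy
    };
    void insert(const T& x) { reset(new Node(x)); }
    void insert(T&& x)      { reset(new Node(std::forward<T>(x))); }
    void reset(Node* n) { delete _node; _node = n; }

    Node* _node;
};

inline void demo_forwarding_layers() {
    Item::copies = 0;
    Item::moves = 0;
    OneSlotList<Item> lt;

    lt.push_back(Item("a"));        // forwarded at every layer: one move copy
    assert(Item::moves == 1 && Item::copies == 0);

    lt.push_back_lossy(Item("b"));  // forwarding dropped: falls back to a deep copy
    assert(Item::moves == 1 && Item::copies == 1);

    Item c("c");
    lt.push_back(c);                // lvalue: deep copy, as expected
    assert(Item::copies == 2);
}
```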