C++ string class, iterators, and range for

string class

C also has strings: a sequence of characters terminated by "\0". The C standard library provides functions for manipulating these strings, but they are not very comprehensive, and they are separate from the string data itself, which does not fit the OOP idea. If these functions are used improperly, problems such as out-of-bounds access can occur, or the code may simply fail to achieve the desired effect.

So C++ improves on this by packaging the data and its operations together as the string class.

The string class manages a character array and provides operations for adding, deleting, and modifying characters, along with some utility algorithms.

Documentation for the string class:

https://cplusplus.com/reference/string/string/?kw=string

We can press Ctrl + F on the documentation page to search within it.

The string class is part of the C++ standard library, so it lives in the std namespace. To use it, either bring the namespace into scope (for example with using namespace std;) or write the qualified name std::string:

string s1;
std::string s2;

string name("Zhang San");
name = "Zhang Fei";

Constructors of the string class

string();
Creates an empty string object.
string (const char* s);
Creates a string from the C-style string s, which must end with "\0".
string (size_t n, char c);
Creates a string containing n copies of the character c.
string (const string& str);
Copy constructor.
string (const string& str, size_t pos, size_t len = npos);
Substring constructor: copies len characters of str, starting at position pos.
string (const char* s, size_t n);
Creates a string from the first n characters of the character array s.
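#include <string>
using namespace std;

int main()
{
    string s1;                 // empty string; "string s1();" would declare a function instead
    string s2("Zhang San");    // from a C-style string
    string s3("hello world");
    string s4(10, '$');        // ten '$' characters
    string s5(s2);             // copy of s2

    string s6(s3, 2, 5);       // "llo w": 5 characters of s3 starting at index 2

    return 0;
}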

In the substring constructor above, string (const string& str, size_t pos, size_t len = npos);, the default value of len is npos. npos is a static constant of type size_t that is initialized with -1. Because size_t is unsigned, -1 wraps around to the largest value that size_t can represent.

The documentation also states that if len is greater than the number of characters remaining after pos, or if len is npos, the copy runs to the end of the string.
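The behaviour of len and npos described above can be checked with a small sketch. The helper names sub and npos_is_max are hypothetical and only serve the illustration; the constructor and string::npos are the standard ones.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a substring with the (str, pos, len) constructor.
// When len is omitted it defaults to npos, i.e. "up to the end of str".
std::string sub(const std::string& str, std::size_t pos,
                std::size_t len = std::string::npos)
{
    return std::string(str, pos, len);
}

// npos is -1 converted to size_t, which is the largest size_t value.
bool npos_is_max()
{
    return std::string::npos == static_cast<std::size_t>(-1);
}
```

For example, sub("hello world", 6, 100) and sub("hello world", 6) both yield "world", because a len that is too large is clamped to the end of the string.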

Access and traversal operations of string objects

The string class also overloads many operators, such as =, <, <=, >=, and ==. These behave as you would expect, but one overload in the string class is especially powerful: operator[].

operator []

In C, the "[]" operator is equivalent to pointer arithmetic followed by a dereference. It can only be used on contiguous storage, such as an array allocated on the stack or a block allocated dynamically on the heap.

For the class type string, we can also use "[]" to access the underlying character array.

Use subscript + [] to access the characters of a string:

string s3("hello world");
// print the contents of s3 directly
cout << s3 << endl;

// subscript + []
for (size_t i = 0; i < s3.size(); i++)
{
    cout << s3[i];
}
cout << endl;

return 0;

output:

Under the hood, the string class stores its characters in an array, normally allocated on the heap (short strings may be kept in a small internal buffer instead). Since it is a string, it ends with "\0". Can the loop above reach that "\0"?

Let’s take a look at the size calculated by the size() function:

cout << s3.size() << endl;

output:

The output is 11, which is the number of characters in "hello world" without counting "\0".

That is because "\0" is not a valid character of the string; it is a special character that marks the end of the string.

If we deliberately print s3[s3.size()], the "\0" is actually accessible. Some environments display nothing for it, but the access itself is valid.
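A minimal sketch of the point about "\0": size() does not count the terminator, yet the position s[s.size()] can still be read. The function names are hypothetical; the second check assumes the string contains no embedded '\0' characters.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Returns true when the character one past the last valid one is '\0'.
// Reading s[s.size()] is allowed; writing anything other than '\0'
// to that position is undefined behaviour.
bool has_terminator(const std::string& s)
{
    return s[s.size()] == '\0';
}

// size() counts only the valid characters, just like strlen on c_str()
// (assuming s holds no embedded '\0').
bool size_matches_strlen(const std::string& s)
{
    return s.size() == std::strlen(s.c_str());
}
```

Calling has_terminator("hello world") returns true even though size() reports 11 for that string.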

We can also modify the characters of a string with "[]", just as we would modify an array, because operator[] returns a reference to the character at the position pos that was passed in:

When called on an ordinary (non-const) object, it returns char&, a reference to the character at pos. When called on a const object, the return type is const char&, a reference to a constant.

string s3("hello world");
char s1[] = "hello world";

s1[1]++; // equivalent to (*(s1 + 1))++;
s3[1]++; // equivalent to s3.operator[](1)++;

The disassembly of the above code is as follows:

We can see that s3[1]++ on the s3 object actually calls the overloaded operator function operator[] under the hood.
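The difference between the non-const and const overloads of operator[] can be sketched as follows. bump_second and read_second are hypothetical names, and both assume the string has at least two characters.

```cpp
#include <cassert>
#include <string>

// Non-const string: operator[] returns char&, so the character can be changed.
std::string bump_second(std::string s)
{
    s[1]++;              // same as s.operator[](1)++
    return s;
}

// Const string: operator[] returns const char&, so only reading compiles.
char read_second(const std::string& s)
{
    // s[1]++;           // would not compile: s[1] is a const char&
    return s[1];
}
```

With "hello world" as input, bump_second turns the 'e' into 'f', while read_second can only report the 'e'.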

Capacity functions in string

Function Name          Function Description
size (important)       returns the number of valid characters in the string
length                 returns the number of valid characters in the string
capacity               returns the total size of the allocated space
empty (important)      returns true if the string is empty, otherwise false
reserve (important)    reserves space for the string
resize (important)     changes the number of valid characters to n, filling any extra positions with the character c

max_size() returns the maximum number of characters this string can hold:

However, the value returned by max_size() differs between compilers and library versions; the result above, for example, is the output under the VS2022 environment.

But if it is in the VS2013 environment, the output is as follows:

So max_size() is not very useful in practice.

capacity() returns the size of the currently allocated space:

string s3("hello world");

cout << s3.capacity() << endl; // 15 (implementation-dependent)

As we know from data structures such as a stack, inserting into a full container triggers an expansion, and a common choice is to double the capacity. For string, however, the growth factor differs between standard library implementations.
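To observe the growth pattern on a particular implementation, one option is to record every capacity the string passes through while characters are appended. This is only a sketch; capacity_steps is a hypothetical helper, and the values it returns are implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends n characters one at a time and records each distinct capacity
// the string reaches. The growth factor depends on the implementation.
std::vector<std::size_t> capacity_steps(std::size_t n)
{
    std::string s;
    std::vector<std::size_t> steps{ s.capacity() };   // starting capacity
    for (std::size_t i = 0; i < n; ++i)
    {
        s.push_back('c');
        if (s.capacity() != steps.back())             // a reallocation happened
            steps.push_back(s.capacity());
    }
    return steps;
}
```

Printing the result of capacity_steps(1000) under different compilers shows how each library grows its buffer; the only thing that can be relied on is that the final capacity is at least 1000.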

Under VS2019:

reserve (open up space directly)

We can use this function to reserve space for n characters in advance, as shown in the following example:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1;
    s1.reserve(100);               // ask for room for at least 100 characters
    size_t sz = s1.capacity();
    cout << "capacity = " << sz << endl;

    cout << "making s grow:" << sz << endl;
    for (int i = 0; i < 100; i++)
    {
        s1.push_back('c');
        if (sz != s1.capacity())   // report every change in capacity
        {
            sz = s1.capacity();
            cout << "capacity change: " << sz << endl;
        }
    }

    return 0;
}

output:

When the characters 'c' are inserted afterwards, no expansion takes place, because space for 100 characters had already been reserved before the insertions.

Note that reserve() reserves space for the string object. If the size we pass later is smaller than the current capacity, will reserve() shrink it? The implementation makes its own decision, so the result is implementation-dependent (and since C++20, reserve() never reduces the capacity):

For example, continuing the code above, we call clear() to remove the stored characters after the expansion, and then call reserve() with a smaller value to shrink the capacity:

······
·····
······
s1.clear();
s1.reserve(10);
cout << "clear_capacity: " << s1.capacity() << endl;

output:

The capacity went from the previous 111 down to 15.

However, if we try to shrink directly without calling clear() first, it does not work:

s1.reserve(10);
cout << "capacity: " << s1.capacity() << endl;

Output (the capacity is still 111):

Also, when we called reserve(100) above, the capacity we got was 111 rather than 100. This may be related to memory alignment in the VS implementation. In any case, the resulting capacity can only be greater than or equal to 100, never less.
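The guarantees that hold on every implementation can be written down as a small sketch: after reserve(n) the capacity is at least n, the size is unchanged, and appending up to n characters needs no further growth. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reserves room for n characters in an empty string and returns the
// capacity that was actually granted (at least n, possibly more).
std::size_t reserved_capacity(std::size_t n)
{
    std::string s;
    s.reserve(n);
    assert(s.empty());             // reserve never changes size()
    return s.capacity();
}

// After reserve(n), pushing n characters should not change the capacity.
bool fits_without_regrowth(std::size_t n)
{
    std::string s;
    s.reserve(n);
    const std::size_t cap = s.capacity();
    for (std::size_t i = 0; i < n; ++i)
        s.push_back('c');
    return s.capacity() == cap;
}
```

The exact number returned by reserved_capacity(100) (111 in the VS run above) is an implementation detail; only the lower bound is portable.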

resize (open up space + initialize with a fill value)

resize also opens up space, but in addition it initializes any newly added positions. If no fill value is passed to the function, the new positions are filled with the null character ('\0'); if a fill value is passed, that value is used:

In the above example, the size (number of valid characters) of the string has changed.

Specify the initial value:

If the size passed to resize() is smaller than the current size, the extra characters are simply removed. As with reserve, however, the capacity does not necessarily shrink; that is again up to the implementation.
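A short sketch of the three cases of resize: growing with the default fill, growing with an explicit fill character, and shrinking. The overloaded helper resized is a hypothetical name; it works on a copy so the original string is left alone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Changes a copy of s to exactly n characters; new positions hold '\0'.
std::string resized(std::string s, std::size_t n)
{
    s.resize(n);
    return s;
}

// Same, but new positions are filled with the character c.
std::string resized(std::string s, std::size_t n, char c)
{
    s.resize(n, c);
    return s;
}
```

For instance, resized("hello", 8, 'x') gives "helloxxx", resized("hello", 2) gives "he", and resized("hello", 7) has size 7 with two trailing '\0' characters.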

In practice, most of the shrink requests mentioned above do not shrink the capacity, because string implementations are relatively conservative. Why not shrink readily? Because shrinking comes at a price. It is not possible to hand just the unneeded tail of a block back to the system. Recall allocating memory dynamically with malloc in C: we cannot keep the front part of a block and release only the back part.

Shrinking is instead done by allocating a new block whose size is the reduced size, copying the data from the original block into it, and then releasing the original block.

If we want to shrink actively, then we can use the following interface implemented in C++11:

Official documentation:

string::shrink_to_fit – C++ Reference (cplusplus.com)

shrink_to_fit is a non-binding request to reduce the capacity to fit the size. The resulting capacity does not necessarily equal the size; because of considerations such as memory alignment, it may be larger.

Even so, shrinking is generally not recommended: there is rarely a need for it, and it has a price.
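As a sketch of what can and cannot be expected from shrink_to_fit: the capacity afterwards is still at least the size and is not larger than before, but how far it drops is up to the implementation. The struct Caps and the function shrink_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Capacity before and after the shrink request, plus the string size.
struct Caps
{
    std::size_t before;
    std::size_t after;
    std::size_t size;
};

Caps shrink_demo()
{
    std::string s;
    s.reserve(200);                // deliberately over-allocate
    s += "hello world";            // 11 valid characters
    const std::size_t before = s.capacity();
    s.shrink_to_fit();             // non-binding request to drop the excess
    return { before, s.capacity(), s.size() };
}
```

On typical implementations the "after" value is far smaller than the "before" value, but code should not depend on a specific number.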

Note:

  • size() and length() have the same underlying implementation. size() exists to be consistent with the interface of other containers.
  • clear() only removes the valid characters in the underlying array of the string class; it does not change the size of that array.
  • resize(size_t n) and resize(size_t n, char c) both change the number of valid characters in the string to n. The difference shows when the number of characters increases: resize(n) fills the extra positions with '\0', while resize(size_t n, char c) fills them with the character c. Note: when resize increases the number of elements, the underlying capacity may grow; when it reduces the number of elements, the total size of the underlying space remains unchanged.
  • reserve(size_t res_arg=0) reserves space for the string without changing the number of valid elements. When the argument is smaller than the current capacity, reserve is not required to reduce the capacity.

Iterator

We can use begin() and end() to obtain an iterator to the first character of the string and an iterator to the position just after the last valid character.

begin + end      begin returns an iterator to the first valid character; end returns an iterator to the position after the last valid character
rbegin + rend    rbegin returns a reverse iterator to the last valid character; rend returns a reverse iterator to the position before the first valid character
string s3("hello world");
string::iterator it = s3.begin();
while (it != s3.end())
{
    cout << *it << " ";
    it++;
}
cout << endl;

output:

In the iterator-based traversal above, we can think of the iterator it as a pointer, although an iterator is not necessarily a pointer.

Likewise, begin() and end() return iterators to the corresponding positions. In the example above they can be thought of as pointers, but the functions are not required to return actual pointers.

*it is the character at the position of it.

++it moves the iterator to the next position, and the loop ends when it == s3.end().

An iterator is a pointer-like type: it may be a raw pointer, or it may be a class type that wraps one.

*it can be understood as dereferencing a pointer, so we can modify a character through it:

cout << s3 << endl;
string::iterator it = s3.begin();
while (it != s3.end())
{
    (*it)--;   // decrement each character in place

    ++it;
}
cout << endl;

it = s3.begin();
while (it != s3.end())
{
    cout << *it << " ";
    it++;
}
cout << endl;

output:

Iterators are not limited to the string class; the standard containers support them as well, and the usage is similar.

vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);

vector<int>::iterator vit = v.begin();
while (vit != v.end())
{
    cout << *vit << endl;
    ++vit;
}

The advantage of iterators is this: contiguous storage such as an array can be accessed with "[]", but that approach requires the storage to be contiguous. A linked list or a tree cannot be accessed with "[]", yet such types can still implement iterators, so iterators can be used to access their data.

Iterators provide a unified way to access simple arrays, more complex linked lists, more complex trees, and hash tables.
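The idea of one traversal pattern for different containers can be sketched with a single function template. The helper join is a hypothetical name; it relies only on begin(), end(), ++ and *, so it works for string, vector, and list alike.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: joins every element of a container that offers
// begin()/end() into one string, separated by single spaces.
template <typename Container>
std::string join(const Container& c)
{
    std::ostringstream out;
    for (auto it = c.begin(); it != c.end(); ++it)
    {
        if (it != c.begin())
            out << ' ';            // separator between elements
        out << *it;
    }
    return out.str();
}
```

The same loop body handles contiguous storage (string, vector) and node-based storage (list), which "[]" could not do.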

Iterators are used together with algorithms. Looking through the algorithms in the library, we find that many functions work with all container types. The reverse() function below, for example, takes iterators as parameters and uses a template to match the various iterator types:

A function template accepts iterators of different types and is instantiated for each of them, so the same algorithm can be used with any iterator type that meets the algorithm's requirements.

In this way, the algorithm can modify the data in a container through its iterators. With iterators and templates, we do not need to care about the type of the data or the container; we just pass in the iterators.
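A minimal sketch of an algorithm working purely through iterators: std::reverse from <algorithm> is applied to a string, a vector, and a list with the same call. The wrapper name reversed is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>
#include <vector>

// One call shape for every container: std::reverse only sees the
// iterator pair and never needs to know the container type.
template <typename Container>
Container reversed(Container c)
{
    std::reverse(c.begin(), c.end());
    return c;
}
```

std::reverse needs bidirectional iterators, which string, vector, and list all provide; a container with forward-only iterators would not compile here.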

begin() and end() return iterators for forward traversal. If we want to traverse in reverse, that is possible too.

The rbegin() and rend() functions serve this purpose, and the positions of the iterators they return are as follows:

These two functions return reverse_iterator objects, so the iterator type we declare should be reverse_iterator.

example:

string s3("hello world");

string::reverse_iterator vit = s3.rbegin();
while (vit != s3.rend())
{
    cout << *vit << " ";
    vit++;
}
cout << endl;

output:

The iterator types above are already somewhat tedious to write out. We can use auto to deduce the iterator type automatically:

auto vit = s3.rbegin();

A few questions:

Suppose we want to pass a string object into a function. Passing it by value calls the copy constructor of the string class, which performs a deep copy; this wastes space and reduces efficiency. The optimization is to pass a reference to the string object, so that no deep copy occurs. If the function should not modify the object, we also qualify the parameter with const.

With a plain reference parameter, using the object's iterator inside the function compiles without error. With a const reference parameter, the same code fails to compile!

As in this example:

void func(const string& s)
{
    // error: s is const, so s.rbegin() returns a const_reverse_iterator
    string::reverse_iterator vit = s.rbegin();
    while (vit != s.rend())
    {
        cout << *vit << " ";
        vit++;
    }
    cout << endl;
}

int main()
{
    string s3("hello world");

    func(s3);

    return 0;
}

Error:

The reported error comes from template code, and the message is very complicated because the template implementation is very complicated.

What actually happens here is an attempt to widen permissions: a const object would be handed to an iterator that allows writing. For const objects there are corresponding const iterators to use:

In the documentation, operator[] has a const overload, const char& operator[] (size_t pos) const;. The iterator functions follow the same pattern: when a function receives a const object, the iterators it uses on that object must be const iterators:

void func(const string& s)
{
    string::const_reverse_iterator vit = s.rbegin();
    while (vit != s.rend())
    {
        cout << *vit << " ";
        vit++;
    }
    cout << endl;
}

output:

Ordinary iterators can both read and write, but const iterators can only read. Writing here means modifying the characters of the string through the iterator.

In other words, with a const iterator it, *it cannot be modified, but it itself can be (for example with ++it).
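The read-only nature of a const iterator can be sketched as follows: the iterator advances normally, but assigning through it does not compile. count_char is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts occurrences of ch using a const_iterator: the iterator itself
// moves (++it), but the character it refers to is read-only.
std::size_t count_char(const std::string& s, char ch)
{
    std::size_t n = 0;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it)
    {
        // *it = 'x';              // would not compile: *it is a const char&
        if (*it == ch)
            ++n;
    }
    return n;
}
```

For "hello world" and 'l' the function returns 3, and the string passed in is guaranteed to be left untouched.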

For the const problem above, auto is very convenient: it deduces the const iterator type automatically!

void func(const string& s)
{
    /*string::const_reverse_iterator vit = s.rbegin();*/
    auto vit = s.rbegin();
    while (vit != s.rend())
    {
        cout << *vit << " ";
        vit++;
    }
    cout << endl;
}

output:

range for

The most convenient way to access the characters of a string is to use range for.

Range for is a new, more concise way of traversing, supported since C++11.

Syntax:

for (element type  variable name (s1) : object to iterate over (s2))
{
    // s1 takes the value of each element of s2 in turn
}
string s3("hello world");
for (auto str : s3)
{
    cout << str;
}

output:

Here auto deduces the type of str, and the loop advances and detects the end automatically.

In the example above, each character is taken from s3 in turn and assigned (copied) to str.

Therefore, if we modify str directly, the string in s3 does not change:

 string s3("hello world");
for (auto str : s3)
{
str++;
}
cout << s3 << endl;

output:

If we declare the loop variable as a reference, we can modify the string:

 string s3("hello world");
for (auto & str : s3)
{
str++;
}
cout << s3 << endl;

output:

Now each str is a reference to a character in the string, so we can modify it.

Under the hood, range for is implemented with iterators. It performs the same kind of traversal as the iterator loops above, and the element-by-element copy described earlier is really a copy of *it into str; this is how the automatic iteration works.

Looking at the disassembly, we can also see traces of the iterator calls:

In range for, the begin() and end() functions are likewise called to determine the start and the end.
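The relationship between range for and iterators can be sketched by writing the loop both ways. The second function is roughly what the compiler generates for the first; both function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Range for version: adds 1 to every character through a reference.
std::string bump_range_for(std::string s)
{
    for (auto& ch : s)
        ch++;
    return s;
}

// Roughly the equivalent hand-written iterator loop.
std::string bump_iterators(std::string s)
{
    auto first = s.begin();        // start of the range
    auto last = s.end();           // one past the last character
    for (; first != last; ++first)
    {
        auto& ch = *first;         // the loop variable is bound to *it
        ch++;
    }
    return s;
}
```

Both functions produce the same result for any input, for example "bcd" for "abc".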

That is to say, the convenient range for is really implemented with iterators: without iterators there is no range for. Types that do not provide iterators therefore do not support range for. For example:

Range for is not supported by stack:
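Because std::stack exposes no begin() or end(), one way to visit its elements is to pop them from a copy. This is only a sketch of that workaround; drain is a hypothetical helper.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// std::stack has no iterators, so "for (auto x : st)" does not compile.
// Taking the stack by value lets us pop a copy and keep the original.
std::vector<int> drain(std::stack<int> st)
{
    std::vector<int> out;
    while (!st.empty())
    {
        out.push_back(st.top());   // elements come out top-first
        st.pop();
    }
    return out;
}
```

The returned vector does support range for, so the elements can then be traversed in the usual way.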

Range for also has limitations. It can only traverse forward; unlike the iterators above, which also offer reverse traversal, range for is just a simple, mechanical forward traversal built on iterators.
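When reverse order is needed, the explicit reverse iterators remain available. A minimal sketch, with backwards as a hypothetical helper name:

```cpp
#include <cassert>
#include <string>

// Collects the characters of s from back to front using reverse
// iterators, which a plain range for over s cannot do.
std::string backwards(const std::string& s)
{
    std::string out;
    for (auto rit = s.rbegin(); rit != s.rend(); ++rit)
        out.push_back(*rit);
    return out;
}
```

Since s is const here, auto deduces const_reverse_iterator, matching the const rules discussed earlier.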