The use of set/multiset and map/multimap in C++

Associative container

In the preliminary stage, we have already been exposed to some containers in STL:

For example: vector, list, deque, forward_list (C++11), etc. These containers are collectively called sequential containers because their underlying data structure is a linear sequence data structure, and the elements themselves are stored in them.

The containers we want to learn about today are called associative containers:

Associative containers are also used to store data. Different from serial containers, they store key-value pairs of structures, which are faster than sequences when retrieving data. Containers are more efficient.

Key-value pairs

Used to represent a structure with one-to-one correspondence. This structure generally contains only two member variables key and value. The key represents
Table key value, value represents the information corresponding to the key. For example: If we want to create a dictionary for English-Chinese translation, then the dictionary must contain
There are English words and their corresponding Chinese meanings, and there is a one-to-one correspondence between English words and their Chinese meanings, that is, through this application
The corresponding Chinese meaning of this word can be found in the dictionary.

template <class T1, class T2>
struct pair
{
    typedef T1 first_type;
    typedef T2 second_type;
    T1 first;
    T2 second;
    pair(): first(T1()), second(T2())
    {}
    pair(const T1 & amp; a, const T2 & amp; b): first(a), second(b)
    {}
};

Pair application:

Pair combines 2 data into one data. Pair can be used when such a requirement is needed.
(1) The map in STL is to save the key and value together (generally first corresponds to key and second corresponds to value)
(2) When a function needs to return 2 data, you can choose pair

So you can think of pair as a key-value pair class provided in the library.

Tree-structured associative container

Depending on the application scenario, STL implements a total of two different structures of associative containers: Tree structure and hash structure.

There are four main types of tree-structured associative containers:

map, set, multimap, multiset

What these four types of containers have in common is that they use a balanced binary search tree (i.e., a red-black tree) as its underlying structure. The elements in the container are an ordered sequence. If the data in an ordinary search binary tree is ordered or close to ordered, The fork search tree will degenerate into a single branch tree, and searching for elements is equivalent to searching for elements in the sequence table, which is inefficient. The balanced binary tree, as its name implies, is to make it balanced on the basis of an ordinary search binary tree (it requires adjusting the nodes in the tree).

set

Get to know set

Document introduction

First let’s take a look at the set:

Set is a class template with three template parameters, but when you use it, you generally only need to control the first template parameter. The last two have default values. If you want to change the sorting method of its underlying search tree, you can do it yourself. Pass the functor for the second comparison method (default is ascending order).

Introduction to set

1. Set is a container that stores elements in a certain order
2. In the set, the value of the element also identifies it (value is the key, type is T), and each value must be unique.
Elements in a set cannot be modified in the container (elements are always const), but they can be inserted or deleted from the container.
3. Internally, the elements in a set are always sorted according to specific strict weak ordering criteria indicated by their internal comparison objects (type comparisons).
4. Set containers are generally slower than unordered_set containers in accessing individual elements by key, but they allow access to individual elements based on order.
Subsets are iterated directly.
5. Set is implemented using binary search tree (red-black tree) at the bottom level.
Note:
1. Different from map/multimap, map/multimap stores real key-value pairs , and set only stores
value, but what is actually stored at the bottom is a key-value pair consisting of .
2. When inserting an element into the set,you only need to insert the value, and there is no need to construct a key-value pair.
3. The elements in set cannot be repeated (so set can be used to remove duplicates).
4. Use the iterator of set to traverse the elements in the set to obtain an ordered sequence.
5. The elements in the set are compared according to less than by default
6. To find an element in set, the time complexity is: O(logn)
7. Elements in setare not allowed to be modified
8. The bottom layer in the set is implemented using binary search tree (red-black tree).

So what we said above:

The underlying structure of these associative containers is actually the binary search tree we learned before. The set actually corresponds to the K model in the application of searching the binary tree we learned earlier. The map we will learn next corresponds to KV. Model.

Usage of set

Constructor

Construct an empty set and insert elements:
#include<set>
#include <iostream>
using namespace std;
void testset1()
{
set<int> s;
s.insert(4);
s.insert(6);
s.insert(3);
s.insert(5);
s.insert(2);
s.insert(1);
s.insert(1);
s.insert(1);
\t
for (set<int>::iterator it = s.begin(); it != s.end(); it + + )
{
cout << *it << " ";
}
cout << endl;
}

int main()
{
testset1();
return 0;
}

Iterator range construction:
void testset1()
{
vector<int> v = { 4,6,3,5,2,1,1,1 };
set<int> s(v.begin(),v.end());
\t
for (set<int>::iterator it = s.begin(); it != s.end(); it + + )
{
cout << *it << " ";
}
cout << endl;
}

int main()
{
testset1();
return 0;
}

But we see that what is printed here is 1,2,3,4,5,6.

We inserted unordered and inserted 3 1’s. Its bottom layer is a search binary tree. Search binary trees generally cannot have duplicate values, so if the same value is inserted multiple times, the insertion will fail ( Or it can be understood that it will automatically remove duplicates). In addition, we know that a binary search tree is also called a binary sorting tree. An ascending sequence can be obtained by in-order traversal, so the default traversal here is printed out in order. So it can be considered that set can achieve sorting + deduplication

It can be seen that the elements in the set cannot be modified. Its bottom layer is a search tree. If the value in the search tree is modified arbitrarily, there is no guarantee that it will still be a search tree after modification, so it cannot be modified.

find and count

find can be used with erase:

void testset1()
{
vector<int> v = { 4,6,3,5,2,1,1,1 };
set<int> s(v.begin(),v.end());

set<int>::iterator it = s.find(3);
if (it != s.end())
{
s.erase(it);
}

for (set<int>::iterator it = s.begin(); it != s.end(); it + + )
{
cout << *it << " ";
}
cout << endl;
}

In fact, you can also use another interface – count

Count is the number of statistical elements, but duplicate values are not allowed in set, so the result is only 1 and 0. If it exists, it is 1, and if it does not exist, it is 0

void testset2()
{
vector<int> v = { 4,6,3,5,2,1,1,1 };
set<int> s(v.begin(), v.end());

if (s.count(3))
cout << "3在" << endl;
else
cout << "3 is not there" << endl;
}

int main()
{
testset2();
return 0;
}

erase

void testset3()
{
std::set<int> myset;

for (int i = 1; i < 10; i + + ) myset.insert(i * 10); // 10 20 30 40 50 60 70 80 90
std::set<int>::iterator it=myset.begin();
myset.erase(it);//Delete 10
myset.erase(40);//Delete 40
it = myset.find(60);
myset.erase(it, myset.end());//Delete 60-90
cout << "myset contains:";
for (it = myset.begin(); it != myset.end(); + + it)
cout << ' ' << *it;
cout << '\
';
}

lower_bound and upper_bound

void testset4()
{
std::set<int> myset;
std::set<int>::iterator itlow, itup;

for (int i = 1; i < 10; i + + ) myset.insert(i * 10); // 10 20 30 40 50 60 70 80 90

itlow = myset.lower_bound(25); // Iterator that is greater than or equal to the val value position, itlow points to 30
itup = myset.upper_bound(70); // Iterator greater than val value position, itup points to 80

// [25, 70]
myset.erase(itlow, itup); // 10 20 80 90
for (auto e : myset)
{
cout << e << " ";
}
cout << endl;
}

eual_range:

Returns a pair, pair.first is lower_bound(val), pair.second is upper_bound(val).

Since the set element is unique, the longest interval length is 1

void testset5()
{
std::set<int> myset;

for (int i = 1; i <= 5; i + + ) myset.insert(i * 10); // myset: 10 20 30 40 50

std::pair<std::set<int>::const_iterator, std::set<int>::const_iterator> ret;

ret = myset.equal_range(35);
std::cout << "the lower bound points to: " << *ret.first << '\
'; // >= val
std::cout << "the upper bound points to: " << *ret.second << '\
'; // > val
cout << endl;
ret = myset.equal_range(30);
std::cout << "the lower bound points to: " << *ret.first << '\
'; // >= val
std::cout << "the upper bound points to: " << *ret.second << '\
'; // > val
}

multiset

There is also a multiset in the header file . The biggest difference between it and set is that the key values stored in it are allowed to be redundant. It can be considered that it is sorted, but does not deduplicate.

How to insert key value if it is allowed to be redundant?

Then just continue to insert. Even if you insert an existing value, in the search tree, we say that the larger value should be inserted into the right subtree, and the smaller value should be inserted into the left subtree. If they are equal now, it will actually be inserted into either side. Yes, it can be on the left or on the right. Anyway, after the balanced binary tree is inserted, the nodes will be adjusted to balance both sides.

find:

If we find a certain value in a multiset, if there are multiple identical values, which one is searched for when finding?
It is stipulated that what it finds is the first one encountered by in-order traversal. When it traverses, the bottom layer performs in-order traversal, so it is printed in order.

void testset6()
{
// sort
multiset<int> s;
s.insert(5);
s.insert(2);
s.insert(6);
s.insert(1);
s.insert(1);
s.insert(2);
s.insert(1);
s.insert(5);
s.insert(2);

cout << "multiset contains:" << endl;
for (auto e : s)
{
cout << e << " ";
}
cout << endl<<endl;

// If there are multiple values, find returns the first val in order
cout << "multiset begins with 2:" << endl;
multiset<int>::iterator it = s.find(2);
while (it != s.end())
{
cout << *it << " ";
+ + it;
}
cout << endl<<endl;

cout << "the number of 1: ";
cout << s.count(1) << endl << endl;

// [>=val, >val)
auto ret = s.equal_range(2);
s.erase(ret.first, ret.second);
cout << "after erasing 2, multiset contains:"<< endl;
for (auto e : s)
{
cout << e << " ";
}
cout << endl << endl;

size_t n = s.erase(1);
cout << "the number of 1: " << n;
cout << "after erasing 1, multiset contains:" << endl;
for (auto e : s)
{
cout << e << " ";
}
cout << endl << endl;
}

map

Get to know map

Document introduction

Map actually corresponds to the KV model of searching a binary tree, and it stores key-value pairs in the structure. That is, it is used to represent a structure with a one-to-one correspondence. This structure generally contains only two member variables, key and value. Key represents the key value, and value represents the information corresponding to the key.

1. Map is an associative container, which stores elements composed of key and value in a specific order (compared by key).
2. In a map, the key value key is usually used to sort and uniquely identify elements, and the value value stores the information associated with this key value key.
content. The types of key and value may be different, and inside the map, key and value are determined by member types.
The value_types are bound together and given an alias name of pair: typedef pair value_type;
3. Internally, the elements in the map are always compared and sorted according to key value key.
4. Accessing a single element by key value in map is usually slower than unordered_map container, but map allows ordering
Iterate directly over the elements (that is, when iterating over the elements in the map, you can get an ordered sequence).
5. Map supports subscript access operators, that is, by putting the key in [], you can find the value corresponding to the key.
6. Maps are usually implemented as binary search trees (more precisely: balanced binary search trees (red-black trees)).

Use of map

insert:

You can see that what is inserted is a value of type value_type, and value_type is a pair. pair.first(key) is const modified and cannot be modified (that is, the key must be unique), and pair.second(value) can.

void test_map1()
{
map<string, string> dict;
dict.insert(pair<string, string>("sort", "sort"));
dict.insert(pair<string, string>("insert", "insert"));
dict.insert(make_pair("right", "right")); // Recommend this
dict.insert(make_pair("string", "string"));
dict.insert(make_pair("string", "string11111"));

for (map<string, string>::iterator it = dict.begin(); it != dict.end(); it + + )
{
cout << it->first << ":" << it->second << endl;
}
}

What is this make_pair? It is actually a function template provided in the library, which can help us create a pair object. The advantage of using it is that we do not need to specify the type ourselves, because the template can automatically deduce the type.

In addition, the key in the map must be unique and cannot be repeated, and the value can be repeated. One key can only correspond to one value, and one value can correspond to multiple keys

Return value of insert:

insert will return a pair:

pair.second is a bool value, bool is true if the insertion is successful, and false if the insertion fails.

pair.first is an iterator. If the insertion is successful, the iterator points to the newly inserted key-value pair. If it fails, it points to the existing key-value pair.

void test_map1()
{
map<string, string> dict;
dict.insert(pair<string, string>("sort", "sort"));
dict.insert(pair<string, string>("insert", "insert"));
dict.insert(make_pair("right", "right")); // Recommend this
dict.insert(make_pair("string", "string"));
auto p = dict.insert(make_pair("string", "string11111"));

cout << p.first->first << ":" << p.first->second << endl;
}

There is another issue to note:

The first template parameter of insert is const key_type: But this paragraph inserts a pair of type pair. The first parameter is not modified by const. Why can it still be inserted?

Therefore, the pair type can be copy-constructed even if there are differences, for example:

So as long as these two U and V (both const char* here) can initialize this->first and this->second, they can be copied, so this copy construct can be a construct or a copy construct :

When U and V have the same type as this->first and this->second, it is copy construction. If the types are different but can be converted, it is construction.

This more conveniently explains why make_pair is recommended:

make_pair is a function template. You can deduce the template parameters by yourself. However, before using pair, you need to manually write out the template type (show instantiation):

Therefore, after make_pair, it does not matter whether it is pair or pair type. In the end, the default constructor will be called to construct pair.

find:

Similar to set’s find, if it finds the iterator that returns the position, if it cannot find it, it returns end.

Use map to write a program for counting times:

void test_map2()
{
string arr[] = { "apple", "watermelon", "apple", "watermelon", "apple", "apple", "watermelon", "apple", "banana", "apple", "banana" } ;
map<string, int> countMap;
for (auto & str : arr)
{
auto ret = countMap.find(str);
if (ret == countMap.end())
{
// Not found means first occurrence, insert
countMap.insert(make_pair(str, 1));
}
else
{
// times + +
ret->second + + ;
}
}

for (auto & amp; kv : countMap)
{
cout << kv.first << ":" << kv.second << endl;
}
}

Understanding [] in map

So the above search code can be replaced with:

void test_map3()
{
string arr[] = { "apple", "watermelon", "apple", "watermelon", "apple", "apple", "watermelon", "apple", "banana", "apple", "banana" } ;
map<string, int> countMap;

for (auto & str : arr)
{
countMap[str] + + ;
}

for (auto & amp; kv : countMap)
{
cout << kv.first << ":" << kv.second << endl;
}
}

Take a closer look at operator[]:

The document also explains that calling this overloaded [] is equivalent to executing this code

(*((this->insert(make_pair(k,mapped_type()))).first)).second

We actually adjusted insert here, so let’s take a closer look at the return value of insert.

Insertion has two situations: success and failure, because the key value does not allow redundancy, it returns a pair, where the first template parameter is an iterator and the second one is bool.

When the insertion is successful, the first of the pair is the iterator pointing to the newly inserted element, and the second is true. When the insertion fails (in fact, the inserted key already exists), then its first is the same one that already exists in the container. Iterator of equivalent key elements, second is false. So the following bool actually indicates whether the insertion is successful or failed.

Then let’s look at this function again:

In addition, we see that insert here also uses the make_pair function to create a pair object. The first parameter is the key we put in [], and the second parameter is passed to adjust the default structure of the value type ( Built-in types also have constructors), the value is int, and the default constructed value is 0, so the number of times we insert it for the first time is 0. Then return this number of references, and the + + outside it is exactly 1.
Then if the same key is inserted later, the insertion will fail. The number will not be changed to 0, but a reference to the number (corresponding to second in the pair) will still be returned. We will continue from 1 + +

Use [] to insert, modify, and delete:

void test_map4()
{
map<string, string> dict;
dict.insert(make_pair("sort", "sort"));
dict.insert(make_pair("insert", "insert"));
dict.insert(make_pair("right", "right"));
dict.insert(make_pair("string", "string"));
dict["girl"];//Insert
dict["boy"] = "boy"; //Insert and modify
dict["girl"] = "girl"; //Modify
cout << dict["girl"] << endl;//Search
}

multimap

Like set, map and multimap

Compared with map, multimap allows key redundancy, so there is no [] in its interface

void test_map5()
{
multimap<string, string> dict;
dict.insert(make_pair("sort", "sort"));
dict.insert(make_pair("insert", "insert"));
dict.insert(make_pair("boy", "boy"));
dict.insert(make_pair("left", "left"));
dict.insert(make_pair("right", "right"));
dict.insert(make_pair("string","string1"));
dict.insert(make_pair("string", "string2"));

for (multimap<string, string>::iterator it = dict.begin(); it != dict.end(); it + + )
{
cout << it->first << ":" << it->second << endl;
}
}

More can be found in the documentation