[C++] STL – the use of set/multiset and map/multimap

Article directory

  • 1. Associative container
  • 2. Associative container of tree structure
  • 3. set
    • 3.1 Understanding set
    • 3.1 Use of set
  • 4. multiset
  • 5. map
    • 5.1 Understanding map
    • 5.2 pairs
    • 5.3 The use of maps
      • Understanding of [] in map
  • 6. multimap

1. Associative container

In the initial stage, we have already touched some containers in STL


For example: vector, list, deque, forward_list (C + + 11), etc. These containers are collectively called serial containers, because the bottom layer is a linear sequence data structure, which stores the elements themselves.

And the few containers we are going to learn today are called associative containers, so what is an associative container? How is it different from a serial container?

The associative container is also used to store data. The difference from the serial container is that it stores the key-value pairs of the structure, which is more convenient than the sequential container when retrieving data. higher efficiency.

2. Associative container of tree structure

According to different application scenarios, STL implements two types of associative containers with different structures:tree structure and hash structure.

There are four main types of associative containers in tree structure:

map, set, multimap, multiset.
These four types of containers have in common that they use a balanced binary search tree (that is, a red-black tree) as their underlying structure, and the elements in the container are an ordered sequence.
We will talk about the balanced binary tree later. In fact, we also mentioned it in the previous articles when we explained the search for the binary tree. If the data in the ordinary search binary tree is ordered or close to the ordered binary search The tree will degenerate into a single-branched tree, and searching for elements is equivalent to searching for elements in the sequence table, which is inefficient.
The balanced binary tree, as the name implies, is to make it balanced on the basis of the ordinary search binary tree (the nodes in the tree need to be adjusted).
We will talk about this in detail later.

Each container is described in turn below.

3. set

3.1 Understanding set

First let’s look at the set

is still a class template with three template parameters, but in fact, we usually only need to care about the first template parameter when we use it.
Because the latter two are the default values, of course, if you want to change the sorting method of the underlying search tree, you can pass the functor of the second comparison method (the default is ascending).

For a detailed introduction to set, you can go to the document

But it’s in English, you can use the translation tool to view

3.1 Use of set

Since we have learned several containers in STL before, the use of these containers here should actually be relatively easy for us. So I will not introduce its interfaces one by one as carefully as before. I believe that everyone can understand many things by reading the document at this stage.

You can take a look at the interfaces it provides, of course some of them are not commonly used.

Then we said above:

The underlying structure of these associative containers is actually the binary search tree we learned before, which is just a balanced search tree.
The set actually corresponds to the K model in the binary tree search application we learned earlier, and the map to be learned below corresponds to the KV model.

Then let’s get familiar with its use:

Take a look at its constructor


Then let’s construct an empty set and insert some values
First use set to include the header file for #include

We use insert to insert several elements (here we see that his interface actually has no push or anything, and generally linear sequential containers have push).
Of course, there are other versions of inset, which are not commonly used, and you can understand them yourself.

Then I want to print it out to see, then we can use iterators to traverse it


But we see that 1, 2, 3, 4 are printed out here.
But we have inserted several values, and they are out of order, so what’s going on here?
, the above said that its bottom layer is a balanced binary tree, then we have learned the binary search tree and actually understand what’s going on:
Searching binary trees generally cannot have duplicate values, so if you insert the same value multiple times, the subsequent insertion will fail (or it can be understood that it will automatically deduplicate); in addition, we know A binary search tree is also called a binary sort tree, and an ascending sequence can be obtained by in-order traversal, so here the default traversal is printed out in order.
So it can be considered that set can achieve sorting + deduplication
If iterators are supported, we can use range for.

Then think about it, can we modify the value in it?


Of course it is not possible, because we said that its bottom layer is a search tree. If we can modify the values in the search tree arbitrarily, can it still be guaranteed to be a search tree after modification?
Not necessarily, because the search tree needs to satisfy the size rule for .
So it cannot be modified.

If we want to judge whether an element is in the set object

can use find interface


But in fact, another interface can be used – count

count is the number of statistical elements, but our set does not allow duplicate values, so the result is only 1 and 0, if it exists, it is 1, if it does not exist, it is 0

We won’t introduce the rest of the interfaces here. With the previous foundation, I believe everyone will be able to see them at a glance. There are still some that are not very useful, or we can’t understand them very well at this stage.

4. multiset

Then in the header file , set comes out, and there is a multiset


Multiset is called multiple sets, and multi has multiple meanings.

Or these interfaces.

So what is the difference between it and set?

, the biggest difference between it and set is that the key value stored in it is allowed to be redundant.
Let’s take a look

It can be considered that it is sorted, but not deduplicated
How about its insertion if the key-value redundancy is allowed?
Then just continue inserting. Even if you insert an existing value, we say that the large value should be inserted into the right subtree, and the small value should be inserted into the left subtree. Yes, it can be on the left or on the right. Anyway, to balance the binary tree, after insertion, the nodes will be adjusted to make both sides balanced (we will talk about balancing the binary tree later).

Then I want to tell you about find:

Now that key-value redundancy is allowed, there is a problem:
If we find a certain value, there are multiple same values with it, which one is the one we are looking for when we find it?
, stipulates that what it finds is the first one encountered by in-order traversal.
When it traverses, the bottom layer is in-order traversal, so the printing is in order.
We can verify this:
Let's go first
Now there are 4 1s, we find(1), and then start from the returned iterator position, if we can print all 4 1s, it proves that it is the first in the sequence 1

Of course, we can use count to count the number of each key value:

However, in practical applications, multiset is generally used less.

5. map

5.1 Understanding map

In fact, the map corresponds to the KV model of the search binary tree, which stores the key-value pairs of the structure. That is, it is used to represent a structure with a one-to-one correspondence relationship. Generally, the structure only contains two member variables key and value, key represents the key value, and value represents the information corresponding to the key.


The interface of map is almost similar to that of set, and we will talk about what needs to be paid attention to later.

5.2 pair

Before learning the use of map, let’s learn a class/structure template in STL – pair

Let’s take a look at the definition of pair in SGI-STL:

template <class T1, class T2>
struct pair
{<!-- -->
typedef T1 first_type;
typedef T2 second_type;
T1 first;
T2 second;
pair()
: first(T1())
, second(T2())
{<!-- -->}
pair(const T1 & amp; a, const T2 & amp; b)
: first(a)
, second(b)
{<!-- -->}
};

Application of pair:

Pair is to combine 2 data into one data, and pair can be used when such a requirement is needed.
(1) The map in STL is to store the key and value together (generally first corresponds to the key, and second corresponds to the value).
(2) Another application is that when a function needs to return 2 data, you can choose pair

So you can think of a pair as a class of key-value pairs provided in the library.

5.3 Use of map

Then we see the insert of the map:


Its first version actually accepts an object of type pair.

The key is const modified and cannot be modified (that is, the key must be unique), but the value can.

Then let’s write the code. We can use map to write a dictionary of English-Chinese translation that we realized before.

map is included in the header file



Of course inserting can also use this

So what is this make_pair?
He is actually a function template provided in the library.

It can help us create a pair object. The advantage of using it is that we don’t need to specify the type ourselves, because the template can automatically deduce the type.
Then let’s traverse and print:

Of course we can also write:

Can everyone see this?

Can the last one be inserted successfully?

No, because the key must be unique and cannot be repeated, and the value can be repeated.
A key can only correspond to one value, and a value can correspond to multiple keys

You can also directly use the find key to find the value

Then we also use map to write a program for counting the number of times:

The same idea as before, I will directly upload the code

Understanding of [] in map

Then, we found that map also provides such an interface


What is its function?

First, the parameter it receives is the key key. If the input key matches the key of the element in the container, it returns a reference to the value corresponding to the key.
So we can also write the statistics as follows

One line of code in the for loop is done

Then let’s study this code next:


In fact, the document also explained that calling this overloaded [] is equivalent to executing this code
(*((this->insert(make_pair(k, mapped_type())).first)).second
So the function looks something like this:

We saw that the insert was actually adjusted, so let’s take a closer look at the return value of the insert

Let me explain again:
First of all, you need to know that if you insert, there are still two cases of success and failure, because the key value does not allow redundancy; it returns a pair, the first template parameter is an iterator, and the second is bool.
When the insertion is successful, the first of the pair is an iterator pointing to the newly inserted element, and the second is true. When the insertion fails (in fact, the inserted key already exists), then its first is the same as the existing one in the container. Iterator of the equivalent key elements, second is false.
So the latter bool actually identifies the success or failure of the insertion.
Then let’s look at this function again

Let’s disassemble it, or it won’t look good

Then let’s look at the insert inside

We see that insert also uses the make_pair function to create a pair object. The first parameter is the key we put in [], and what does the second parameter pass?
We see that it is the default construction of the type whose value is adjusted (we said before that the built-in type also needs to have a constructor after the template), then our value is int, and the value of the default construction is 0, so we The number of times it happens to be inserted for the first time is 0. Then return the reference to the value corresponding to this number of times, which is exactly 1 when facing it + + outside.
Then if the same key is inserted later, the insertion will fail, and the number of times will not be changed to 0, but the reference of the number of times (corresponding to the second in the pair) will still be returned. We can continue from 1 to + +.
Of course, some of the types given above have been typedef’d, we don’t look very good, but it provides a table, you can check it out

So we can simplify it a bit so that it’s easier to see

So we can also play like this:


The operations of inserting, searching, and modifying are written like this

Summary

1. The elements in the map are key-value pairs
2. The key in the map is unique and cannot be modified
3. By default, the keys are compared by less than
4. If the elements in the map are traversed by an iterator, an ordered sequence can be obtained
5. The bottom layer of the map is a balanced search tree (red-black tree), and the search efficiency is relatively high

o

(

l

o

g

2

N

)

O(log_2N)

O(log2?N)
6. Support the [] operator, the actual insertion search in operator[]

6. multimap

That’s the same as set:

In addition to map and multimap

let’s take a look


Similarly, compared with map, multimap allows key redundancy

So there is no [] in its interface

Simply demonstrate:


Of course, it is also possible to have the same key and value.

Then I won’t say much about the rest, you can check the documentation yourself if you need it.