Data structure: Principle and implementation of red-black tree

Article directory

  • Red black tree concept
  • red black tree properties
  • Simulation implementation of red-black tree
    • Red-black tree balance problem
  • Overall implementation and testing

This article is used to disassemble and simulate the red-black tree, laying the foundation for subsequent map and set encapsulation.

The concept of red-black trees

The red-black tree is also a binary search tree, but a new value is added inside each node to represent the color of the node. There are two types: black and red. By comparing any path from the root to the leaf, Due to the limitations of the coloring method of each node, the red-black tree can ensure that no path can be twice as long as other paths, so it is balanced.

The basic pattern of the red-black tree is shown in the figure below

Properties of red-black trees

  1. Each node is either red or black
  2. The root node is black
  3. If a node is red, then its two children are black
  4. For each node, the simple paths from that node to all its descendant leaf nodes contain the same number of black nodes.
  5. Each leaf node is black (the leaf node here refers to the empty node)

Why does a red-black tree have the number of nodes in the longest path not exceeding twice the number of the shortest path?

In fact, the reason lies in the nature of the red-black tree. In the red-black tree, there can be two identical black nodes connected together, but there will never be two red nodes connected together, and the number of black nodes on each path is the same. Yes, based on these two reasons, the longest path in a red-black tree is just a red node interspersed with a black node… and the shortest path is that all black nodes are one after another. For this reason, the above properties can be guaranteed Got it

Simulation implementation of red-black tree

Basic definition

enum Color
{<!-- -->
RED,
BLEAK
};

template<class K, class V>
struct BSTreeNode
{<!-- -->
BSTreeNode<K, V>* _left;
BSTreeNode<K, V>* _right;
BSTreeNode<K, V>* _parent;
pair<K, V> _kv;
Color_col;

BSTreeNode(const pair<K, V> & amp; kv)
:_left(nullptr)
, _right(nullptr)
, _parent(nullptr)
, _kv(kv)
,_col(RED)
{<!-- -->}
};

Why is RED used as the default value when defining information here?

Due to the properties of red-black trees, it can be known that the number of black nodes in a path is determined. When a red node is inserted, the information of the current path will be affected at most, but if a black node is inserted, then It will inevitably cause anomalies in all complete paths in the entire tree and destroy the balance in the red-black tree.

The balance problem of red-black trees

After inserting a new node, the balance of the red-black tree may be destroyed. This is discussed below on a case-by-case basis.

Definition: The current node is cur, the father node is parent, the grandfather node is grandparent, and the uncle node is uncle. The insertion problem of red-black tree focuses on uncle.

1. If the parent node is black


The simplest case, no need to do any processing, just insert it

2. cur is red, parent is red, grandfather is black, uncle exists and is red


At this time, two red nodes appear one after another. This situation is not allowed, so adjustments must be made: change parent and uncle to black, and change grandfather to red.

At this time, we need to continue to discuss the situation. If grandfather is the root node, it means that the adjustment has been completed at this time and no further adjustment is needed. Therefore, the root node is set to black. If grandfather is not the root node, and the above one The node is still red, then two more red nodes appear one after another. At this time, you need to continue to adjust, treat grandfather as cur, and then adjust.


3. cur is red, parent is red, grandfather is black, uncle does not exist or is black

Analyze based on uncle’s situation:

  1. If the uncle node does not exist, then it means that cur must be a newly inserted node. This is because the black nodes under the path must be the same. At this time, there are two situations. It may be inserted on the left and right sides of the parent.

  1. If the uncle node exists and is black, it means that the cur node must be black. It is now red because the cur subtree turned the cur node red during the adjustment process. If cur is a newly inserted node, Then the red-black tree turns out to be wrong, because the following scene does not exist

    So it must be something like this:

At this time, cur is not a newly inserted node. The newly inserted node is one of the left and right subtrees of cur. It is now red because the adjustment of the subtree below turns cur into red. It was originally black.

Then it is time to rotate: rotate the grandparent right to complete the effect of lowering the height, and then change the color.


Therefore, by combining the above processes, the implementation of the code can be completed.

 bool insert(const pair<K, V> & amp; kv)
{<!-- -->
Node* cur = _root;
Node* parent = nullptr;
// Completed according to the basic logic of searching the binary tree
if (_root == nullptr)
{<!-- -->
_root = new Node(kv);
}
else
{<!-- -->
//Insert data
while (cur)
{<!-- -->
if (cur->_kv.second > kv.second)
{<!-- -->
//If the inserted element is smaller than the current node element, insert it to the left
parent = cur;
cur = cur->_left;
}
else if (cur->_kv.second < kv.second)
{<!-- -->
//The inserted element is larger than the current node element and inserted to the right
parent = cur;
cur = cur->_right;
}
else
{<!-- -->
return false;
}
}
// At this time, parent points to the last node and cur is empty.
cur = new Node(kv);
if (parent->_kv.second > cur->_kv.second)
{<!-- -->
// If the inserted node is smaller than its parent, insert it to the left
parent->_left = cur;
cur->_parent = parent;
}
else if (parent->_kv.second < cur->_kv.second)
{<!-- -->
// If the inserted node is larger than its parent, insert it to the right
parent->_right = cur;
cur->_parent = parent;
}
else
{<!-- -->
return false;
}
}

// At this point, the insertion of the ordinary binary search tree has been completed, and it is time to adjust the height of the red-black tree.
// The termination condition is that parent is empty, or parent is already black, which means no adjustment is needed.
// parent is red, which means grandparent must exist
while (parent & amp; & amp; parent->_col == RED)
{<!-- -->
// The core of the change is the uncle, so we must first find the uncle
// Generally speaking, there are two situations. The parent may be the left or right side of the grandparent, and the uncle may be the other side.
Node* grandparent = parent->_parent;
if (parent == grandparent->_left)
{<!-- -->
Node* uncle = grandparent->_right;
// 1. Uncle exists and is red
if (uncle & amp; & amp; uncle->_col == RED)
{<!-- -->
//g
// p u
//c
\t\t\t\t\t
// change color
parent->_col = uncle->_col = BLACK;
grandparent->_col = RED;
\t\t\t\t
// Process upward
cur = grandparent;
parent = cur->_parent;
}

// 2. Uncle does not exist
else
{<!-- -->
// If cur is the left child
if (cur == parent->_left)
{<!-- -->
// g
// p
// c
\t\t\t\t\t\t
// Right-rotate the grandparent
RotateR(grandparent);
// change color
cur->_col = grandparent->_col = RED;
parent->_col = BLACK;
}
// If cur is the right child
else
{<!-- -->
// g g
// p --> c --> c
// c p p g
\t\t\t\t\t\t
// Rotate left to parent, rotate right to grandparent
RotateL(parent);
RotateR(grandparent);
// change color
cur->_col = BLACK;
parent->_col = grandparent->_col = RED;
}

// After the update, the order of parent and grandparent is messed up, and there is no need to continue adjusting, just break
break;
}
}
\t\t\t
// parent is the right child of grandparent, the same logic is repeated again
else
{<!-- -->
Node* uncle = grandparent->_left;
if (uncle & amp; & amp; uncle->_col == RED)
{<!-- -->
// change color
parent->_col = uncle->_col = BLACK;
grandparent->_col = RED;

// Continue processing
cur = grandparent;
parent = cur->_parent;
}
else
{<!-- -->
if (cur == parent->_right)
{<!-- -->
// g
// p
// c
RotateL(grandparent);
parent->_col = BLACK;
grandparent->_col = RED;
}
else
{<!-- -->
//g
// p
//c

RotateR(parent);
RotateL(grandparent);
cur->_col = BLACK;
grandparent->_col = RED;
}

break;
}
}
}
// It doesn't matter how the above changes, just make sure the root node is black.
_root->_col = BLACK;

return true;
}

void RotateL(Node* parent)
{<!-- -->
Node* subR = parent->_right;
Node* subRL = subR->_left;

parent->_right = subRL;
subR->_left = parent;

Node* parentParent = parent->_parent;

parent->_parent = subR;
if(subRL)
subRL->_parent = parent;

if (_root == parent)
{<!-- -->
_root = subR;
subR->_parent = nullptr;
}
else
{<!-- -->
if (parentParent->_left == parent)
{<!-- -->
parentParent->_left = subR;
}
else
{<!-- -->
parentParent->_right = subR;
}

subR->_parent = parentParent;
}
}

void RotateR(Node* parent)
{<!-- -->
Node* subL = parent->_left;
Node* subLR = subL->_right;

parent->_left = subLR;
if(subLR)
subLR->_parent = parent;

Node* parentParent = parent->_parent;

subL->_right = parent;
parent->_parent = subL;

if (_root == parent)
{<!-- -->
_root = subL;
subL->_parent = nullptr;
}
else
{<!-- -->
if (parentParent->_left == parent)
{<!-- -->
parentParent->_left = subL;
}
else
{<!-- -->
parentParent->_right = subL;
}

subL->_parent = parentParent;
}
}

Overall implementation and testing

enum Color
{<!-- -->
RED,
BLACK
};

template<class K, class V>
structRBTreeNode
{<!-- -->
RBTreeNode(const pair<K, V> & amp; kv)
:_left(nullptr)
, _right(nullptr)
, _parent(nullptr)
, _kv(kv)
,_col(RED)
{<!-- -->}

RBTreeNode<K, V>* _left;
RBTreeNode<K, V>* _right;
RBTreeNode<K, V>* _parent;
pair<K, V> _kv;
Color_col;
};

template<class K, class V>
classRBTree
{<!-- -->
typedef RBTreeNode<K, V> Node;
public:
RBTree()
:_root(nullptr)
{<!-- -->}

bool insert(const pair<K, V> & amp; kv)
{<!-- -->
Node* cur = _root;
Node* parent = nullptr;
// Completed according to the basic logic of searching the binary tree
if (_root == nullptr)
{<!-- -->
_root = new Node(kv);
}
else
{<!-- -->
//Insert data
while (cur)
{<!-- -->
if (cur->_kv.second > kv.second)
{<!-- -->
//If the inserted element is smaller than the current node element, insert it to the left
parent = cur;
cur = cur->_left;
}
else if (cur->_kv.second < kv.second)
{<!-- -->
//The inserted element is larger than the current node element and inserted to the right
parent = cur;
cur = cur->_right;
}
else
{<!-- -->
return false;
}
}
// At this time, parent points to the last node and cur is empty.
cur = new Node(kv);
if (parent->_kv.second > cur->_kv.second)
{<!-- -->
// If the inserted node is smaller than its parent, insert it to the left
parent->_left = cur;
cur->_parent = parent;
}
else if (parent->_kv.second < cur->_kv.second)
{<!-- -->
// If the inserted node is larger than its parent, insert it to the right
parent->_right = cur;
cur->_parent = parent;
}
else
{<!-- -->
return false;
}
}

// At this point, the insertion of the ordinary binary search tree has been completed, and it is time to adjust the height of the red-black tree.
// The termination condition is that parent is empty, or parent is already black, which means no adjustment is needed.
// parent is red, which means grandparent must exist
while (parent & amp; & amp; parent->_col == RED)
{<!-- -->
// The core of the change is the uncle, so we must first find the uncle
// Generally speaking, there are two situations. The parent may be the left or right side of the grandparent, and the uncle may be the other side.
Node* grandparent = parent->_parent;
if (parent == grandparent->_left)
{<!-- -->
Node* uncle = grandparent->_right;
// 1. Uncle exists and is red
if (uncle & amp; & amp; uncle->_col == RED)
{<!-- -->
//g
//p u
//c
\t\t\t\t\t
// change color
parent->_col = uncle->_col = BLACK;
grandparent->_col = RED;
\t\t\t\t
// Process upward
cur = grandparent;
parent = cur->_parent;
}

// 2. Uncle does not exist
else
{<!-- -->
// If cur is the left child
if (cur == parent->_left)
{<!-- -->
// g
// p
// c
\t\t\t\t\t\t
// Right-rotate the grandparent
RotateR(grandparent);
// change color
cur->_col = grandparent->_col = RED;
parent->_col = BLACK;
}
// If cur is the right child
else
{<!-- -->
// g g
// p --> c --> c
// c p p g
\t\t\t\t\t\t
// Rotate left to parent, rotate right to grandparent
RotateL(parent);
RotateR(grandparent);
// change color
cur->_col = BLACK;
parent->_col = grandparent->_col = RED;
}

// After the update, the order of parent and grandparent is messed up, and there is no need to continue adjusting, just break
break;
}
}
\t\t\t
// parent is the right child of grandparent, the same logic is repeated again
else
{<!-- -->
Node* uncle = grandparent->_left;
if (uncle & amp; & amp; uncle->_col == RED)
{<!-- -->
// change color
parent->_col = uncle->_col = BLACK;
grandparent->_col = RED;

// Continue processing
cur = grandparent;
parent = cur->_parent;
}
else
{<!-- -->
if (cur == parent->_right)
{<!-- -->
//g
// p
//c
RotateL(grandparent);
parent->_col = BLACK;
grandparent->_col = RED;
}
else
{<!-- -->
//g
//p
//c

RotateR(parent);
RotateL(grandparent);
cur->_col = BLACK;
grandparent->_col = RED;
}

break;
}
}
}
// It doesn't matter how the above changes, just make sure the root node is black.
_root->_col = BLACK;

return true;
}

void RotateL(Node* parent)
{<!-- -->
Node* subR = parent->_right;
Node* subRL = subR->_left;

parent->_right = subRL;
subR->_left = parent;

Node* parentParent = parent->_parent;

parent->_parent = subR;
if(subRL)
subRL->_parent = parent;

if (_root == parent)
{<!-- -->
_root = subR;
subR->_parent = nullptr;
}
else
{<!-- -->
if (parentParent->_left == parent)
{<!-- -->
parentParent->_left = subR;
}
else
{<!-- -->
parentParent->_right = subR;
}

subR->_parent = parentParent;
}
}

void RotateR(Node* parent)
{<!-- -->
Node* subL = parent->_left;
Node* subLR = subL->_right;

parent->_left = subLR;
if(subLR)
subLR->_parent = parent;

Node* parentParent = parent->_parent;

subL->_right = parent;
parent->_parent = subL;

if (_root == parent)
{<!-- -->
_root = subL;
subL->_parent = nullptr;
}
else
{<!-- -->
if (parentParent->_left == parent)
{<!-- -->
parentParent->_left = subL;
}
else
{<!-- -->
parentParent->_right = subL;
}

subL->_parent = parentParent;
}
}

void inorder()
{<!-- -->
_inorder(_root);
cout << endl;
}

bool isbalance()
{<!-- -->
return _isbalance(_root);
}

private:
bool _check(Node* root, int blacknum, const int RefVal)
{<!-- -->
// Determine whether the number of black paths is equal
if (root == nullptr)
{<!-- -->
if (blacknum != RefVal)
{<!-- -->
cout << "The number of black nodes is not equal" << endl;
return false;
}
return true;
}

// Determine whether there are consecutive red nodes
if (root->_col == RED & amp; & amp; root->_parent->_col == RED)
{<!-- -->
cout << "There are consecutive red nodes" << endl;
return false;
}

if (root->_col == BLACK)
{<!-- -->
blacknum + + ;
}

return _check(root->_left, blacknum, RefVal)
& amp; & amp; _check(root->_right, blacknum, RefVal);
}

// Determine whether the red-black tree is balanced
bool _isbalance(Node* root)
{<!-- -->
// If the root node is red, it means something is wrong
if (root->_col == RED)
{<!-- -->
cout << "The root node is red" << endl;
return false;
}

// Count how many black nodes there are in the path
int RefVal = 0;
Node* cur = root;
while (cur)
{<!-- -->
if (cur->_col == BLACK)
RefVal + + ;
cur = cur->_left;
}

// Determine whether the black nodes in the path are equal
return _check(root, 0, RefVal);
}

void _inorder(Node* root)
{<!-- -->
if (root == nullptr)
return;
_inorder(root->_left);
cout << root->_kv.second << " ";
_inorder(root->_right);
}

Node* _root = nullptr;
};
int main()
{<!-- -->
const int N = 100000;
vector<int> v;
v.reserve(N);
srand(time(0));

for (size_t i = 0; i < N; i + + )
{<!-- -->
v.push_back(rand() + i);
}

RBTree<int, int> t;
for (auto e : v)
{<!-- -->
cout << "Insert:" << e << "->";
t.insert(make_pair(e, e));
cout << t.isbalance() << endl;
}

cout << t.isbalance() << endl;

return 0;
}