C language data structure – tree, heap (heap sort), TOPK problem

Blogger’s homepage: @. A bright moon?

?Column series:Linear algebra, C beginners training, problem solving C, C use articles, “beginner” C++, data structure

Motto:“Don’t wait until you have nothing before making up your mind to do it”

If you think it’s good, I beg you to pay a little attention, a little love, and give pointers

Directory

Tree

tree concept

tree representation

binary tree

Binary tree concept:

special binary tree

Properties of Binary Trees

Binary tree storage structure

2. Chain storage

heap

Sequential structure of binary tree

The concept and structure of the heap

heap sort

Implementation of heap sort

build pile

heap sort

TOPK


tree

The concept of a tree

A tree is a non-linear data structure, which is a set of hierarchical relationships composed of n (n>=0) finite nodes. It is called a tree because it looks like an upside-down tree, which means it has the roots pointing up and the leaves pointing down.

Figure 1

Figure 2

Degree of node: The number of subtrees contained in a node is called the degree of the node; as shown in Figure 2: the degree of A is 6

Leaf node (terminal node): A node with a degree of 0 is called a leaf node; as shown in Figure 2: B, C, H, I, etc. are leaf nodes

Parent node (parent node): If a node contains child nodes, this node is called the parent node of its child nodes; as shown in Figure 2: A is the parent node of B

Child node (child node): the root node of the subtree contained in a node is called the child node of the node; as shown in Figure 2: B is the child node of A

Brother nodes: nodes with the same parent node are called brother nodes; as shown in Figure 2: B and C are brother nodes

Degree of the tree: In a tree, the degree of the largest node is called the degree of the tree; as shown in Figure 2: the degree of the tree is 6

The level of nodes: starting from the definition of the root, the root is the first level, the child nodes of the root are the second level, and so on

The height of the tree (the depth of the tree): the maximum level of nodes in the tree; as shown in Figure 2: the height of the tree is 4

Cousin nodes: nodes whose parents are on the same layer are cousins; as shown in Figure 2: H and I are cousin nodes

Node ancestors: from the root to all nodes on the branches of the node; as shown in Figure 2; A is the ancestor of all nodes

Descendants: Any node in the subtree rooted at a node is called a descendant of the node; as shown in Figure 2: all nodes are descendants of A

Forest: A collection of m (m>0) disjoint trees is called a forest

Representation of a tree

The tree structure is more complicated than the linear table, and it is more troublesome to store and express. Since the value range is saved, the relationship between nodes and nodes is also saved. In practice, there are many ways to represent trees, such as: parent representation , child representation, child parent representation, and child sibling representation. Here we simply understand the most commonly used child brother notation, that is, left child right brother notation.

typedef int DataType;
struct Node
{
struct Node* _firstChild1; // the first child node
struct Node* _pNextBrother; // point to its next brother node
DataType _data; // data field in the node
};

Figure 3

The use of trees in practice: representing the directory tree structure of the file system

Figure 4

Binary tree

Binary tree concept:

Binary Tree (Binary Tree) is a finite set of n (n>=0) nodes, the set is either an empty set (called an empty binary tree), or consists of a root node and two disjoint nodes, called root nodes The left subtree and the right subtree are binary trees.

Figure 5

(1) The maximum degree of a binary tree is 2;

(2) Each node has at most two subtrees;

(3) The left subtree and right subtree are in order;

(4) Even if a node in the tree has only one subtree, the left and right must be distinguished;

Note: Any binary tree is composed of the following conditions:

Figure 6

special binary tree

1. Full binary tree: A binary tree, if the number of nodes in each layer reaches the maximum value, then this binary tree is a full binary tree. That is, if a binary tree has k layers and the total number of nodes is 2^k-1, then it is a full binary tree.

2. Complete binary tree: A complete binary tree is a very efficient data structure, and a complete binary tree is derived from a full binary tree. For a binary tree with a depth of k and n nodes, it is called a complete binary tree if and only if each node corresponds to the nodes numbered from 1 to n in the full binary tree with a depth of k. Note that a full binary tree is a special kind of complete binary tree.

Figure 7

The nature of the binary tree

1. If the number of layers of the root node is specified as 1, then there are at most 2^(i-1) nodes on the i-th layer of a non-empty binary tree.

2. If the number of layers of the root node is specified as 1, then the maximum number of nodes in a binary tree with a depth of h is 2^h-1.

3. For any binary tree, if the degree is 0, the number of leaf nodes is n0, and the number of branch nodes with degree 2 is n2, then there is n0=n2 + 1

4. If the number of layers of the root node is specified as 1, the depth of a full binary tree with n nodes, h=log2^(n + 1) (log is based on 2, and n + 1 is logarithm)

5. For a complete binary tree with n nodes, if all nodes are numbered from 0 according to the order of the array from top to bottom and from left to right, then for the node with the sequence number i:

(1. If i>0, the parent number of the node at i position: (i-1)/2; i=0, i is the root node number, no parent node

(2. If 2i + 1=n otherwise there is no left child

(3. If 2i + 2=n otherwise there is no right child

Storage structure of binary tree

Binary trees can generally be stored using two structures, a sequential structure and a chain structure.

1. Sequential storage

Sequential structure storage is to use arrays for storage. Generally, arrays are only suitable for representing complete binary trees, because not complete binary trees will waste space. In reality, only the heap will use arrays for storage. Binary tree sequential storage is physically an array and logically a binary tree

Figure 8

2. Chain storage

heap

Order structure of binary tree

Ordinary binary trees are not suitable for storage in arrays, because there may be a lot of wasted space. The complete binary tree is more suitable for sequential structure storage. In reality, we usually store the heap (a binary tree) in an array of sequential structures. It should be noted that the heap here and the heap in the virtual process address space of the operating system are two different things. One is the data structure, and the other is the management in the operating system. A region of memory is segmented

The concept and structure of the heap

If there is a key set K = {k0,k1,k2,…,k[n-1]}, store all its elements in the order of a complete binary tree

In a one-dimensional array, and satisfy: Ki<=K[2*i + 1] and Ki<=K[2*i + 2](Ki>=K[2*i + 1] and Ki>=K [2*i + 2]), i=0,1,2…n

It is called a small heap (or a large heap). The heap with the largest root node is called the largest heap or large root heap, and the heap with the smallest root node is called the smallest heap or small root heap.

Properties of the heap:

The value of a node in the heap is always not greater than or not less than the value of its parent node;

The heap is always a complete binary tree

Figure 9

Heap Sort

Heap sorting is a sorting algorithm designed using the data structure of the heap. Its time complexity is O(N*logN). Compared with bubble sorting (O(N*N)), its time efficiency is very high.

Implementation of Heap Sort

1. Construct the unordered sequence into a heap, and select the large root heap (small root heap) according to the ascending (descending) requirements

2. Exchange the top element with the end element, and “sink” the largest element (minimum element) to the end of the array

3. Readjust the structure to meet the heap definition, and then continue to exchange the top element of the heap with the current end element, and repeat the adjustment + exchange steps until the entire sequence is in order

Build pile

There are two ways to build a heap, one is to adjust the heap upwards and the other is to adjust the heap downwards, no matter whether you are building a large root heap or a small root heap.

Adjust Up:

Here we are preparing to build a small root heap. Swap is an exchange function (exchange the values of two variables), which determines whether the value of the last child node child is smaller than the value of its parent node parent. If the value of child is smaller than the value of parent, then Exchange the values of the parent and child nodes, then the position of the child node child is moved to the position of the parent node parent, and the parent node parent is moved to the parent node of the parent node, and then judge the value of the child node child and the value of the parent node parent again until child Moved to the root node, if the value of the child is greater than the value of the parent, there is no need to exchange and the adjustment is complete.

void AdjustUP(int* a,int n)
{
    int child=n-1;
    int parent=(child-1)/2;
    while(child>0)
    {
        if(a[child]<a[parent])
        {
            Swap( &a[child], &a[parent]);
            child=parent;
            parent=(child-1)/2;
        }
        else
        {
            break;
        }
    }
}

Adjustment down:

Here is to build a small root heap, to determine whether the value of the parent node parent is smaller than the value of its child node child (generally, the parent node has two child nodes, and the child node here is the smaller of the two), if child If the value of the parent node is smaller than the value of the parent node, the values of the parent and child nodes are exchanged, and then the position of the parent node parent is moved to the position of the child node child, and the child node child is moved to the position of the child node of the child node, and then the value of the child node child is judged again The size of the value of the parent node parent until the value of the child is greater than the number of elements, if the value of the child is greater than the value of the parent, there is no need to exchange and the adjustment is completed.

void AdjustDown(int* a, int n, int parent)
{
    int child=parent*2 + 1;
    while(child<n)
    {
        if(child + 1<n & &a[child]>a[child + 1])
        {
            child ++ ;
        }
        if(a[child]<a[parent])
        {
            Swap( &a[child], &a[parent]);
            parent=child;
            child=parent*2 + 1;
        }
        else
        {
            break;
        }
    }
}

Adjust build up:

Building up the heap starts from the second layer until the hth layer, so its time complexity is O(N*logN)

    for(int i=1;i<n;i++)
    {
        AdjustUP(a, i);
    }

Adjust build pile down:

The downward construction is from the h-1th layer to the first layer, so its time complexity is O(N)

    for(int i=(n-1-1)/2;i>=0;i--)
    {
        AdjustDown(a, n, i);
    }

Heap Sort

Since there are two ways to build a heap, one is to adjust the heap upward and the other is to adjust the heap downward, so there are two ways to sort the heap.

HeapSort_UP(upward heap sorting):

//Time complexity is O(N*logN)
void HeapSort_UP(int* a, int n)
{
    //Building up the heap starts from the second layer to the hth layer, so its time complexity is O(N*logN)
    for(int i=1;i<n;i++)
    {
        AdjustUP(a, i);
    }
    int end=n-1;
    //After the heap is built, it needs to be adjusted downwards. The time complexity is O(N*logN)
    while(end>0)
    {
        Swap( &a[0], &a[end]);
        AdjustDown(a, end, 0);
        end--;
    }
}

HeapSort_Down(downward heap sorting):

//Time complexity is N*logN
void HeapSort_Down(int* a, int n)
{
    //Building down the heap is from the h-1th layer to the first layer, so its time complexity is O(N)
    for(int i=(n-1-1)/2;i>=0;i--)
    {
        AdjustDown(a, n, i);
    }
    int end=n-1;
    //After the heap is built, it needs to be adjusted downwards. The time complexity is O(N*logN)
    while(end>0)
    {
        Swap( &a[0], &a[end]);
        AdjustDown(a, end, 0);
        end--;
    }
}

Summary: Upward building starts from the second level to the hth level, and downward building starts from the h-1th level to the first level. Although they go through the same number of layers, there is no need to adjust the first level in the upward building. There is no need to adjust the hth layer in the downward construction (h is the last layer). There is only one element in the first layer, and there are 2^(h-1) elements in the hth layer. Therefore, their time complexity is different. It is O(N*logN) to build up the heap, and O(N) to build the heap down. Try to use the way of building down the heap to achieve heap sorting.

TOPK

TOP-K problem: Find the top K largest elements or smallest elements in the data combination. Generally, the amount of data is relatively large.
For example: the top 10 professional players, the world’s top 500, the rich list, the top 100 active players in the game, etc.
For the Top-K problem, the most simple and direct way that can be thought of is sorting, but: if the amount of data is very large, sorting is not advisable (it may not be possible to load all the data into memory at once). The best way is to use the heap to solve it. The basic idea is as follows:
1. Use the first K elements in the data set to build a heap
For the first k largest elements, build a small heap
For the first k smallest elements, build a large heap
2. Use the remaining N-K elements to compare with the top elements in turn, and replace the top elements if they are not satisfied. After comparing the remaining N-K elements with the top elements in turn, the remaining K elements in the heap are all Find the first K smallest or largest elements.

For example: here we find the 5 largest numbers among 1,000,000 numbers. For the convenience of implementation, we use rand() to generate 1,000,000 numbers. In order to be close to the actual combat project, we will write the 1,000,000 data to the file, and then read it from the file.

#include
#include 
#include 
void Swap(int* e1, int* e2)
{
    int temp=*e1;
    *e1=*e2;
    *e2=temp;
}
void AdjustDown(int* a, int n, int parent)
{
    int child=parent*2 + 1;
    while(child<n)
    {
        if(child + 1<n & &a[child]>a[child + 1])
        {
            child ++ ;
        }
        if(a[child]<a[parent])
        {
            Swap( &a[child], &a[parent]);
            parent=child;
            child=parent*2 + 1;
        }
        else
        {
            break;
        }
    }
}
void CreateData(void)
{
    //system("pwd\
");
    int n=1000;
    srand((unsigned int)time(NULL));
    const char* file="data.txt";
    FILE* fout=fopen(file, "w");
    for(int i=0;i=0;i--)
    {
        AdjustDown(kminheap, k, i);
    }
    int val=0;
    while(!feof(fin))
    {
        fscanf(fin, "%d", &val);
        if(val>kminheap[0])
        {
            kminheap[0]=val;
            AdjustDown(kminheap, k, 0);
        }
    }
    for(int i=0;i

If you still don't understand or have suggestions, you can post them in the comment area. We will discuss, learn together, and make progress together. thank you all!

The knowledge points of the article match the official knowledge files, and you can further learn relevant knowledge algorithm skill treehome page overview 47440 people are learning