[Data structure]—heap sorting + TOP-K problem (understand the underlying principle of game ranking)

Article directory

  • foreword
  • 1. Two ways to build heaps:
    • 1.1 Adjust upward to build the heap (heap sort):
      • 1.1.1 Complete code:
      • 1.1.2 Flowchart (taking small heap as an example): ascending order: building a large heap
      • 1.1.3 Flowchart (taking the small heap as an example): descending order: building a small heap
    • 1.2 Adjust the heap building downward (heap sort):
      • 1.2.1 Complete code:
      • 1.2.2 Flowchart:
  • 2. Comparison of the time complexity of the two heap building methods:
    • 2.1 Adjust the heap upwards:
    • 2.2 Adjust the heap downwards:
  • 3. Time complexity of heap sorting: O(N*logN)
  • Fourth, echo the part of the previous chapter: use the heap to make the data orderly (not recommended)
  • 5. TOP-K Questions:
    • 5.1 TOP-K problem ideas:
    • 5.2 TOP-K question code:
  • 6. File operation:
  • Summarize

Foreword

Personal homepage: @小沉早夜bald
Editor’s introduction: Welcome to my messy little planet
Column: Data Structure
This chapter content: heap sorting + TOP-K problem
For you: The sunset falls into the Zhaozhao star field, the world is suddenly late, and the mountains and rivers are autumn
Remember to comment + like + favorite + follow Oh~

Reminder: The following is the text of this article, the following case is for reference

1. Two ways to build heaps:

1.1 Adjust upward to build the heap (heap sort):

1.1.1 Complete code:

//Heap.h
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <stdbool.h>
typedef int HPDataType;
typedef struct Heap
{<!-- -->
HPDataType* a;
int size;
int capacity;
}HP;
void AdjustDown(int* a, int n, int parent);
//Because this function is to be called in Test.c and Test.c contains #include"Heap.h", so include it here
void AdjustUp(HPDataType* a, int child);
void HeapInit(HP* php);
void HeapDestroy(HP* php);
void HeapPush(HP* php, HPDataType x);
void HeapPop(HP* php);
HPDataType HeapTop(HP* php);
bool HeapEmpty(HP* php);
int HeapSize(HP* php);

//Heap.c
void HeapInit(HP* php)
{<!-- -->
assert(php);
php->a = NULL;
php->size = 0;
php->capacity = 0;
}
void Swap(HPDataType* p1, HPDataType* p2)
{<!-- -->
HPDataType tmp = *p1;
*p1 = *p2;
*p2 = tmp;
}
void AdjustUp(HPDataType* a, int child)
{<!-- -->
int parent = (child - 1) / 2;
while (child > 0)
{<!-- -->
if (a[child] < a[parent])
{<!-- -->
Swap( &a[child], &a[parent]);
child = parent;
parent = (child - 1) / 2;
}
else
{<!-- -->
break;
}
}
}

void AdjustDown(int* a, int n, int parent)
{<!-- -->
int child = parent * 2 + 1;
while (child < n)
{<!-- -->
if (child + 1 < n & & a[child + 1] < a[child])
{<!-- -->
child ++ ;
}
if (a[child] < a[parent])
{<!-- -->
Swap( &a[parent], &a[child]);
parent = child;
child = parent * 2 + 1;
}
else
{<!-- -->
break;
}
}
}

//Test.c
void HeapSort(int* a, int n)
{<!-- -->
HP hp;
HeapInit( &hp);
for (int i = 1;i<n; i ++ )
{<!-- -->
//heap upward adjustment algorithm
AdjustUp(a, i);//When i=0, that is, the subscript of the child is 0. At this time, a piece of data can be regarded as a heap, so it starts to adjust upwards from 1
}
int end = n - 1;
while (end > 0)
{<!-- -->
//Small heap as an example, the first one must be the minimum number by building a heap
Swap( &a[0], &a[end]);
//Adjust the selected decimal
AdjustDown(a, end, 0);//The end here is n-1, which is the subscript of the last data and the number of previous data except the last data, so you can pass end to the past
end--;
}
}

int main()
{<!-- -->
//HPTest();
int a[] = {<!-- --> 7,8,3,5,1,9,5,4 };
HeapSort(a, sizeof(a) / sizeof(a[0]));
return 0;
}

1.1.2 Flowchart (taking small heap as an example): ascending order: building a large heap

The final result is an ascending order

1.1.3 Flowchart (taking small heap as an example): descending order: building a small heap

The final result is a descending order

1.2 Adjust down the heap (heap sort):

1.2.1 Complete code:

//Heap.h
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <stdbool.h>
typedef int HPDataType;
typedef struct Heap
{<!-- -->
HPDataType* a;
int size;
int capacity;
}HP;
void AdjustDown(int* a, int n, int parent);
void AdjustUp(HPDataType* a, int child);
void HeapInit(HP* php);
void HeapDestroy(HP* php);
void HeapPush(HP* php, HPDataType x);
void HeapPop(HP* php);
HPDataType HeapTop(HP* php);
bool HeapEmpty(HP* php);
int HeapSize(HP* php);
//Heap.c
void HeapInit(HP* php)
{<!-- -->
assert(php);
php->a = NULL;
php->size = 0;
php->capacity = 0;
}
void Swap(HPDataType* p1, HPDataType* p2)
{<!-- -->
HPDataType tmp = *p1;
*p1 = *p2;
*p2 = tmp;
}
void AdjustUp(HPDataType* a, int child)
{<!-- -->
int parent = (child - 1) / 2;
while (child > 0)
{<!-- -->
if (a[child] < a[parent])
{<!-- -->
Swap( &a[child], &a[parent]);
child = parent;
parent = (child - 1) / 2;
}
else
{<!-- -->
break;
}
}
}

void AdjustDown(int* a, int n, int parent)
{<!-- -->
int child = parent * 2 + 1;
while (child < n)
{<!-- -->
if (child + 1 < n & & a[child + 1] < a[child])
{<!-- -->
child ++ ;
}
if (a[child] < a[parent])
{<!-- -->
Swap( &a[parent], &a[child]);
parent = child;
child = parent * 2 + 1;
}
else
{<!-- -->
break;
}
}
}
//Test.c
#include"Heap.h"
//Directly build the heap to adjust --- adjust the heap downward
void HeapSort(int* a, int n)
{<!-- -->
HP hp;
HeapInit( &hp);
for (int i = (n-1-1)/2; i >= 0; i--)
{<!-- -->
AdjustDown(a, n, i);
}
int end = n - 1;
while (end > 0)
{<!-- -->
Swap( &a[0], &a[end]);
AdjustDown(a, end, 0);
end--;
}
}

int main()
{<!-- -->
//HPTest();
int a[] = {<!-- --> 7,8,3,5,1,9,5,4 };
HeapSort(a, sizeof(a) / sizeof(a[0]));
return 0;
}

1.2.2 Flowchart:

Please add a picture description

2. Comparison of the time complexity of the two heap building methods:

Through the comparison of the two heap building methods, it is more recommended to use downward adjustment heap building but upward adjustment heap building method is more Easy to understand depends on personal situation

2.1 Adjust the heap upwards:

2.2 Adjust the build pile downwards:

3. Time complexity of heap sorting: O(N*logN)

Fourth, echo the part of the previous chapter: use the heap to make the data orderly (not recommended)

Use the created heap array to sort: We can use the following method – it is too troublesome to build the heap first (NlogN), and then copy the data back and forth (high space complexity) – the specific code can be seen [Data structure] – The blogger patted you and threw a “heap” binary tree (heap concept + structure + code implementation) at you, only the following changes were made in the Test.c file, and the rest remained unchanged

void HeapSort(int* a,int n)
{<!-- -->
HP hp;
HeapInit( &hp);
//Build N*logN
for (int i = 0;i<n; i ++ )
{<!-- -->
HeapPush( &hp, a[i]);
}
//N*logN
int i = 0;
while (!HeapEmpty( &hp))
{<!-- -->
int top = HeapTop( &hp);
a[i + + ] = top;
HeapPop( &hp);
}
for (int i = 0; i < n; i ++ )
{<!-- -->
printf("%d ", a[i]);
}
HeapDestroy( &hp);
}

int main()
{<!-- -->
int a[] = {<!-- --> 7,8,3,5,1,9,5,4 };
HeapSort(a, sizeof(a) / sizeof(a[0]));
return 0;
}

Five, TOP-K question:

5.1 TOP-K problem ideas:

TOP-K problem: Find the first K largest or smallest elements in the data combination, generally the amount of data is relatively large . For example: the top 10 professional players, the world’s top 500, the rich list, the top 100 active players in the game, etc.

For the Top-K problem, the most simple and direct way that can be thought of is sorting
Please add a picture description

But: If the amount of data is very large, sorting is not advisable (maybe the data cannot be loaded into memory all at once).
The best way is to use the heap to solve it. The basic idea is as follows:
Please add a picture description

5.2 TOP-K question code:

Find the top 5 largest numbers among 1000 numbers

Except Test.c made following changes to Heap.h and Heap.c according to above

//Test.c
#define _CRT_SECURE_NO_WARNINGS 1
#include"Heap.h"
#include <time.h>
void CreateNDate()
{<!-- -->
// create data
int n = 1000;
srand(time(0));//generate random number
const char* file = "data.txt";
FILE* fin = fopen(file, "w");//open the file for writing
if (fin == NULL)
{<!-- -->
perror("fopen error");
return;
}

for (size_t i = 0; i < n; + + i)
{<!-- -->
int x = rand() % 1000000;
fprintf(fin, "%d\\
", x);//write
}

fclose(fin);
}

void PrintTopK(int k)
{<!-- -->
const char* file = "data.txt";
FILE* fout = fopen(file, "r");//Reading
if (fout == NULL)
{<!-- -->
perror("fopen error");
return;
}
//malloc space
int* kminheap = (int*)malloc(sizeof(int) * k);
if (kminheap == NULL)
{<!-- -->
perror("malloc fail");
return;
}
//Read the first K data
for (int i = 0; i < k; i ++ )
{<!-- -->
fscanf(fout, "%d", & kminheap[i]);
}
//Build a heap (build a small heap)
for (int i = (k - 1 - 1) / 2; i >= 0; i--)
{<!-- -->
AdjustDown(kminheap, k,i);
}
// compare coverage
int val = 0;
while (!feof(fout))//Jump out of the loop after reading the file
{<!-- -->
fscanf(fout, "%d", & amp;val);//Start reading from k + 1 data and compare with heap top data
if (val > kminheap[0])
{<!-- -->
kminheap[0] = val;//override
AdjustDown(kminheap, k, 0);
}
}
for (int i = 0; i < k; i ++ )
{<!-- -->
printf("%d ",kminheap[i]);
}
printf("\\
");
}

int main()
{<!-- -->
CreateNDate();
PrintTopK(5);

return 0;
}

Six, file operation:

You can see C language – file operation (detailed explanation of thousands of characters)

Summary

Please add a picture description
Ending, this is the end of today’s heap sorting + TOP-K problem~, if you want to know more For more, please pay attention to me, one click and three links~