5.6 Top K high-frequency elements (LC347-M)

Algorithm:

Heap: A heap is a complete binary tree. The value of each node in the tree is not less than (or not greater than) the value of its left and right children. If the father node is greater than or equal to the left and right children, it will be a big top heap, and if it is less than or equal to the left and right children, it will be a small top heap.

If you use a queue, the row from large to small is the big top pile, and the row from small to large is the small top pile.

This question mainly involves the following three parts:

  1. To count the frequency of occurrence of elements
  2. Sort by frequency
  3. Find the first K high-frequency elements

(1) Count the frequency of occurrence of elements. For this type of problem, dict can be used for statistics.

(2) Then to sort the frequency, here we can use a container adapter which is Priority Queue.

What is a priority queue?

In fact, it is a heap disguised as a queue, because the priority queue external interface only takes elements from the head of the queue and adds elements from the tail of the queue. There is no other way to take elements. It looks like a queue.

In this question, we will use a priority queue to sort some frequencies.

Therefore,you can use the priority queue to traverse dict. The priority queue holds k elements. After the traversal is completed, the k elements in the priority queue are what we require.

So should we use a large top pile or a small top pile?

Assume using a large top stack

Therefore, a small top pile should be used.

But please note that the final priority queue is arranged from small to large, and the results we output should be arranged from large to small, soit needs to be reversed. And you have to get the key in the dict as the result ({key:value}, key is the element, value is the frequency)

New knowledge in the code:

`heapq` is a module in the Python standard library that provides the implementation of the heap data structure.

  • The `heapq` module provides a number of functions to manipulate the heap, including:
  • `heapify(iterable)`: Convert an iterable object to a heap data structure.
  • `heappush(heap, item)`: Insert elements into the heap.
  • `heappop(heap)`: Pop and return the smallest element from the heap.
  • `heapreplace(heap, item)`: Pop and return the smallest (or largest) element and insert the new element into the heap.
  • `nlargest(k, iterable)`: Returns the largest k elements in the iterable object.
  • `nsmallest(k, iterable)`: Returns the smallest k elements in the iterable object.

Note: The above functions need to be called through the `heapq` module, such as heapq.heappop(heap)

By using the `heapq` module, we can easily implement operations such as heap sorting and obtaining the largest/smallest k elements. If you have any additional questions, please feel free to ask.

To implement a large top heap:

There are two methods:
1) First create a small root heap, and then each time heappop(), you will get the arrangement from small to large, and then reverse
2) Use the opposite number to build a large root heap, and then heappop(-element). That is push(-element), pop(-element)

get function:

  • `map.get(x, 0)`: This is the dictionary’s `get` method for Get the value corresponding to the key `x` in the dictionary. If the key does not exist, the default value 0 is returned. The purpose of this is to ensure that the first time an element is encountered, its corresponding value is 0.
  • `map[x] = map.get(x, 0) + 1`: This line of code converts the dictionary `map` Add 1 to the value corresponding to the middle key `x` and store the result back in the dictionary.

map.items():

map.items() is a method of dictionary `map`, which is used to return a list containing Iterable of all key-value pairs in the dictionary. Eachkey-value pair is represented as a tuple `(key, value)`.

Debugging process:

import heapq
class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        map = {}
        #Statistical frequency
        for i in range(len(nums)):
            map[nums[i]]= map.get(nums[i],0) + 1
            #{nums[i]:Number of occurrences}
        #Create a small top heap
        pri_que = []
        for key, val in map.items():
            #(key,val) is put into the list in the form of tuples
            pri_que.append((key,val))
            if len(pri_que) > k:
                heapq.heappop(pri_que)
        #Get keys in reverse order
        res = []
        #The first `-1`: This is the end value of the loop. The loop will be executed until the position before the end value, that is, until `-1`. In this example, the loop will execute to `0` because `-1` is the position before the end value.
        #The second `-1`: This is the step size of the loop. It represents an increment or decrement on each loop iteration. In this example, the step size is `-1`, which means that the value of `i` will be reduced by 1 each time the loop iterates.
        for i in range(k-1,-1,-1):
        #By using the index operation `[0]`, we can get the first element of the tuple, which is `key`, which represents a number. Therefore, the result of `heapq.heappop(pri_que)[1]` is the number popped from the mini-heap.
            res[i] = heapq.heappop(pri_que)[0]
        return res


Cause: The problem occurs in this line of code: `res[i] = heapq.heappop(pri_que)[1]`. Before this, we did not initialize the `res` list to an empty list of sufficient length. Therefore, elements of the `res` list cannot be assigned directly by index.

Change res=[] to: res = [0]*k

After modification:

import heapq
class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        map = {}
        #Statistical frequency
        for i in range(len(nums)):
            map[nums[i]]= map.get(nums[i],0) + 1
            #{nums[i]:Number of occurrences}
        #Create a small top heap
        pri_que = []
        for key, val in map.items():
            #(key,val) is put into the list in the form of tuples
            pri_que.append((key,val))
            if len(pri_que) > k:
                heapq.heappop(pri_que)
        #Get keys in reverse order
        res = [0]*k
        #The first `-1`: This is the end value of the loop. The loop will be executed until the position before the end value, that is, until `-1`. In this example, the loop will execute to `0` because `-1` is the position before the end value.
        #The second `-1`: This is the step size of the loop. It represents an increment or decrement on each loop iteration. In this example, the step size is `-1`, which means that the value of `i` will be reduced by 1 each time the loop iterates.
        for i in range(k-1,-1,-1):
        #By using the index operation `[0]`, we can get the first element of the tuple, which is `key`, which represents a number. Therefore, the result of `heapq.heappop(pri_que)[1]` is the number popped from the mini-heap.
            res[i] = heapq.heappop(pri_que)[0]
        return res


reason:

Change pri_que.append((key, val)) to heapq.heappush(pri_que, (val, key)),

Just change res[i] = heapq.heappop(pri_que)[0] to res[i] = heapq.heappop(pri_que)[1]

`heapq.heappush(pri_que, (key, val))` and `pri_que.append((val, key))` are two different ways of inserting elements into the mini-heap `pri_que`.

`heapq.heappush(pri_que, (key, val))` inserts the tuple `(key, val)` into the mini-heap `pri_que` using the `heapush` function of the `heapq` module. This function performs a comparison based on the first element in the tuple and adjusts the structure of the heap based on the comparison result to maintain the nature of a mini-max heap.

`pri_que.append((val, key))` adds the tuple `(val, key)` to the end of the list `pri_que` using the list’s `append` method directly. Doing so will not maintain the nature of the small top heap and requires manual adjustment of the heap structure in subsequent operations.

In this specific code, since we want to implement a small top heap with a fixed size of `k`, we need to check whether the size of the heap exceeds `k` every time after inserting an element. If it exceeds, pop the top element of the heap to Keep the heap size no larger than `k`. Use `heapq.heappush` to automatically adjust the structure of the heap and ensure that the top element of the heap is the smallest. Using the `append` method requires manual adjustment of the heap structure to ensure that the top element of the heap is the smallest.

Correct code:

import heapq
class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        map = {}
        #Statistical frequency
        for i in range(len(nums)):
            map[nums[i]]= map.get(nums[i],0) + 1
            #{nums[i]:Number of occurrences}
        #Create a small top heap
        pri_que = []

        ##Use a small top heap with a fixed size of k to scan the values of all frequencies.
        for key, val in map.items():
            #(key,val) is put into the list in the form of tuples
            heapq.heappush(pri_que, (val, key))
         
            if len(pri_que) > k:
                heapq.heappop(pri_que)
        #Get keys in reverse order
        res = [0]*k
        #The first `-1`: This is the end value of the loop. The loop will be executed until the position before the end value, that is, until `-1`. In this example, the loop will execute to `0` because `-1` is the position before the end value.
        #The second `-1`: This is the step size of the loop. It represents an increment or decrement on each loop iteration. In this example, the step size is `-1`, which means that the value of `i` will be reduced by 1 each time the loop iterates.
        for i in range(k-1,-1,-1):
        #By using the index operation `[1]`, we can get the 2nd element of the tuple, which is `key`, which represents a number. Therefore, the result of `heapq.heappop(pri_que)[1]` is the number popped from the mini-heap.
            res[i] = heapq.heappop(pri_que)[1]
        return res


The knowledge points of the article match the official knowledge files, and you can further learn related knowledge. Algorithm skill tree Home page Overview 57444 people are learning the system