LRU and LFU cache implementation (c++)

  • background
  • LRUs
    • topic
    • data structure
    • Implementation
      • Definition of doubly linked list node
      • LRUCache class
        • initialization
        • main interface implementation
  • LFU
    • topic
    • data structure
    • Implementation
      • Definition of node class and linked list class
      • LFUCache class
        • member variable definition
        • interface implementation
  • summary

Background

When a process applies for memory through the malloc function, the allocated memory is virtual memory, not the actual physical memory, and the memory management unit has not completed virtual memory to physical memory at this time mapping. When the program needs to access the virtual memory and finds that the memory is not mapped to the physical memory, a page fault interrupt will be generated. After entering the page fault interrupt, judge whether there is physical memory that can be allocated. If there is, then the physical page will be allocated and the mapping from virtual memory to physical memory will be established through the page table of the MMU.

If there is no physical memory that can be allocated, background memory reclamation and direct memory reclamation will be performed sequentially. If the demand cannot be met after memory reclamation, will be triggered >OOM mechanism to kill processes that occupy a large amount of memory.

If there is no free page in the physical memory that can be allocated after a page fault interrupt, you need to use the page replacement algorithm to select the physical page (dirty page) and place it on the disk. This article mainly introduces LRU and LFU.

The LRU algorithm selects the memory page that has not been accessed for the longest time to set the ring, which is generally achieved by maintaining a linked list. The head area of the linked list is the frequently accessed memory, and the tail area is the memory that is not frequently accessed. It is to reclaim the memory at the end of the linked list first. The LFU algorithm selects the page with the least number of visits for replacement, if the number of visits is the same, use the LRU algorithm to further determine which page to replace.

LRU

Title

Data structure

The LRU algorithm ensures that each time an element is accessed, the element needs to be inserted into the head of the linked list. If the current capacity is greater than the maintained capacity, the element needs to be deleted from the tail. The insertion and deletion of the linked list is very convenient, and the time complexity meets the requirements; but the search Elements need to be traversed. At this time, hash table mapping can be considered, and the query can also meet the time complexity of O(1).
As shown in the figure below: the mapping relationship of key–>node is stored in the hash table, and the reason why the key is also included in the doubly linked list is that when the linked list is subsequently deleted, the corresponding elements in the hash table need to be deleted at the same time .

Concrete implementation

Definition of doubly linked list nodes

Four elements: key, value, predecessor node and successor node

class DoubleList{<!-- -->
public:
    int key;
    int value;
    DoubleList *prev;
    DoubleList *next;
    DoubleList(): key(0), value(0), prev(nullptr), next(nullptr){<!-- -->}
    DoubleList(int _key, int _value): key(_key), value(_value), prev(nullptr), next(nullptr){<!-- -->}
};

LRUCache class

Initialization

Define initialized member variables: head and tail nodes, hash table, capacity, current size

class LRUCache {<!-- -->
public:
    LRUCache(int capacity) : capacity_(capacity), size_(0){<!-- --> //data structure initialization
        head = new DoubleList();
        tail = new DoubleList();
        head->next = tail;
        tail->prev = head;
        //********************
        //
        //********************
    }
private:
    DoubleList *head;
    DoubleList *tail;
    unordered_map<int, DoubleList *> map;
    int size_;
    int capacity_;
};

Main interface implementation

put()
If there is no such element, new out the element node, add it to the head of the linked list, and add the mapping relationship to the hash table. Because the newly added element needs to judge whether the size exceeds the capacity of LRUCache, if it exceeds, delete the tail Points and corresponding hash table mappings. If so, update the value and add it to the head of the linked list.

void put(int key, int value) {<!-- -->
    if (map.find(key) == map.end()) {<!-- -->
        DoubleList *node = new DoubleList(key, value);
        //map[key] = node;
        map.insert(pair<int, DoubleList*>(key, node));
        addHead(node);
        size_++;
        if (size_ > capacity_) {<!-- -->
            DoubleList *tmp = tail->prev;
            removeNode(tmp);
            map.erase(tmp->key); //The hash table needs to be deleted, so there is also a key in the node
            delete(tmp);
            size_--;
        }
    }
    else {<!-- -->
        DoubleList *node = map[key];
        node->value = value;
        moveToHead(node);
    }
}

get()
If the element exists, return the value and put the element into the head node of the linked list

 int get(int key) {<!-- -->
     if (map.find(key) == map.end()) {<!-- -->
         return -1;
     }
     else {<!-- -->
         DoubleList *node = map[key];
         moveToHead(node);
         return node->value;
     }
 }

addHead(), removeNode(), moveToHead() and other operations related to the double-linked list will not be described in detail…
Here, the functions related to the operation of the linked list are written in the LRUCache class. In fact, a node class and a doubly linked list class should be defined, and the related operations of the linked list are encapsulated in the linked list class. Later, LFU will write more standardized.

LFU

Title

Data structure

Features:

  • The FreqDoubleList linked list stores nodes sorted by the number of visits from small to large
  • In the DoubleList is the real node for storing data
  • Each freqNode node in FreqDoubleList has its own DoubleList
  • freq_map stores the mapping between the accessed key and the node freqNode that stores the number of visits to the key;
  • lru_map is the same as the map in the aforementioned LRU algorithm, the difference is that it can be mapped to nodes of multiple linked lists. There is an LRU double-linked list under each freqNode, because if the number of visits is the same, further decisions must be made based on the LRU algorithm.

Concrete implementation

Definition of node class and linked list class

  • Definition of LRUDoubleList linked list and its Node:

Implement the interface of inserting and deleting nodes in the linked list class, which is similar to the previous one. The difference is that isempty(), this function means to judge whether the LRUDoubleList has no data stored, because the linked list belongs to freqNode, if there is no data with access times equal to freqNode->fre_ after the operation, Then ferqNode needs to be deleted, so it is necessary to judge whether the linked list is empty, and then delete the useless freqNode node.

class Node {<!-- -->
public:
    int key_;
    int value_;
    Node *prev;
    Node *next;
    Node(): key_(0), value_(0), prev(nullptr), next(nullptr){<!-- -->}
    Node(int key, int value): key_(key), value_(value), prev(nullptr), next(nullptr){<!-- -->}
};

class LRUDoubleList {<!-- -->
public:
    Node *head;
    Node *tail;
    LRUDoubleList(): head(new Node()), tail(new Node()) {<!-- -->
        head->next = tail;
        tail->prev = head;
    }

    void addHead(Node *node) {<!-- -->
        node->next = head->next;
        node->prev = head;
        head->next->prev = node;
        head->next = node;
    }

    Node* delNode(Node *node){<!-- -->
        node->next->prev = node->prev;
        node->prev->next = node->next;
        return node;
    }

    Node* delLast() {<!-- -->
        return delNode(tail->prev);
    }

    bool isempty() {<!-- -->
        return head->next == tail;
    }

};
  • Definition of FreDoubleList linked list and its freqNode:

The important place is in FreDoubleListhead and tail node assignment and addCurFreq() function

class FreqNode {<!-- -->
public:
  int freq_;
  FreqNode *prev;
  FreqNode *next;
  //****************A freqNode node corresponds to an LRU linked list******************
  LRUDoubleList *dl;
  
  //explicit prevents implicit conversion, in fact, it just wants to be a little more standardized, so it is written like this
  explicit FreqNode(int freq):freq_(freq), prev(nullptr), next(nullptr), dl(new LRUDoubleList()){<!-- -->}
  
};

class FreDoubleList {<!-- -->
public:
  FreqNode *head;
  FreqNode *tail;
  
  // When initializing freqDoubleList, the head and tail nodes can be set to 0 and INT_MAX
  FreDoubleList(): head(new FreqNode(0)), tail(new FreqNode(INT_MAX)) {<!-- -->
      head->next = tail;
      tail->prev = head;
  }

  //Add one to the current frequency
  void addCurFreq(FreqNode *cur) {<!-- -->
      //if not exist, create new freq;
      if (cur->freq_ + 1 != cur->next->freq_) {<!-- -->
          FreqNode *newFreqNode = new FreqNode(cur->freq_ + 1);

          newFreqNode->next = cur->next;
          newFreqNode->prev = cur;
          cur->next->prev = newFreqNode;
          cur->next = newFreqNode;
      }
  }

  void delFreqNode(FreqNode *node) {<!-- -->
      node->next->prev = node->prev;
      node->prev->next = node->next;
  }
};

LFUCache class

Member variable definition

The two hash tables required above; the linked list indicating the number of visits, and the LRU linked list under it (created when freqNode is created); the capacity to store data

private:
    //int size_;
    int capacity_;
    unordered_map<int, FreqNode*> fre_map;
    unordered_map<int, Node*> lru_map;
    FreDoubleList *fl;

Interface implementation

get()
If not found, return -1 directly;
If found, first delete the original node, and then update the data structure

int get(int key) {<!-- -->
//If the data is not found, return -1 directly
    if (lru_map.find(key) == lru_map.end()) {<!-- -->
        return -1;
    }
 //The data exists, perform two steps: 1. Delete the original; 2. Update the data structure
    //1, delete old
    //The number of visits + 1, so first increase the corresponding freqNode node
    FreqNode *before = fre_map[key];
    fl->addCurFreq(before);
    
    //Visited, so fre + 1, that is, before->next is the newly added ferqNode node
    FreqNode* after = before->next;
    Node *targetNode = lru_map[key];
    //Delete the nodes in the LRU linked list under the original freqNode
    before->dl->delNode(targetNode);
    //Judge whether the LRU linked list still has elements, if not, explain the freqNode node
    if (before->dl->isempty()) {<!-- -->
        fl->delFreqNode(before);
    }
    //2, update to the next freq node
    after->dl->addHead(targetNode); //Update the nodes in the LRU linked list under frqNode
    fre_map[key] = after; //The number of visits has changed, and the mapping needs to be updated
    return targetNode->value_;
   
}

put()

void put(int key, int value) {<!-- -->
    if (capacity_ == 0) return;
    if (lru_map.find(key) != lru_map.end()) {<!-- -->
        lru_map[key]->value_ = value;
        get(key);
    }
    else {<!-- --> //If there is no element, you need to add it
        if(lru_map. size() == capacity_) {<!-- -->
            //if will out of size, shoule delete node(Last LruNode in first LRUDoubledList)
            FreqNode *node = fl->head->next;
            Node *nodeToDel = node->dl->delLast();
            //delete map
            lru_map.erase(nodeToDel->key_);
            fre_map.erase(nodeToDel->key_);
            //The minimum frequency column has no elements, delete ....
            if (node->dl->isempty()) {<!-- -->
                fl->delFreqNode(node);
            }
        }
    
        //add node must update fl->head
        fl->addCurFreq(f1->head);
        Node *newNode = new Node(key, value);
        fre_map[key] = fl->head->next;
        lru_map[key] = newNode;
        //************Pay attention to the structure nesting here****************
        fl->head->next->dl->addHead(newNode);
    }
}

Summary

The writing of LFU is more standardized, and the linked list and its nodes are defined separately, and the structure is clearer. I am writing LRU and LFU all day today. . . slipped away to write the fund