SkipList_DB
Table of Contents
- 1. Source code analysis
- 2. Node node class
- member attribute
- member function
- 3. SkipList jump list class
- member attribute
- member method
- 1. Constructor and destructor
- 2. get_random_value
- 3. create_node
- 4. Insert data insert_element
- 5. Display data display_list
- 6. Find data search_element
- 7. Delete data delete_element
- 8. Data dump dump_file
- 9. Load data load_file
- 10. Return data size size
- 11. Data preprocessing operations
1. Source code analysis
core code file
SkipList.h // Implementation file of node class and skip list class
core function
insert_element // insert element delete_element // delete element search_element // find element
2. Node node class
Member attributes
// private member K key // key V value // value // public members Node<K, V> **forward; // point to the next node in a different layer in the linear table int node_level; // which node level the current node is in
Member function
No-argument constructor and destructor
// no parameter construction Node() {} // destructor ~Node(); template<typename K, typename V> Node<K, V>::~Node() { // Do some memory release operations delete []forward; };
Constructor with parameters
Node(K k, V v, int); template<typename K, typename V> Node<K, V>::Node(const K k, const V v, int level) { this->key = k; this->value = v; this->node_level = level; // level + 1, the index is from 0 - level this->forward = new Node<K, V>*[level + 1]; // Fill forward array with 0(NULL) memset(this->forward, 0, sizeof(Node<K, V>*)*(level + 1)); };
Question 1: Why is there a difference in the parameter list between function declaration and implementation?
When function is declared, the formal parameter may not have a parameter name, but when function is defined, the formal parameter must have a parameter name.
Available in both C and C++, but both are not recommended to omit
Question 2: Should the initialization operation of forward here be clearer?
// Create a memory space of level + 1 size, in which the address of Node<K,V> type is to be placed this->forward = new Node<K,V>*[level + 1]; // The memset function initializes this memory // The sizeof function first obtains the size of a Node<K,V>* type memory block, and then multiplies it by level + 1 memset(this->forward, 0, sizeof(Node<K, V>*)*(level + 1));
Get method of key and value
K get_key() const; template<typename K, typename V> K Node<K, V>::get_key() const { return key; }; V get_value() const; template<typename K, typename V> V Node<K, V>::get_value() const { return value; };
value set method
void set_value(V); template<typename K, typename V> void Node<K, V>::set_value(V value) { this->value=value; };
3. SkipList jump list class
Member attributes
// all are private attributes // The maximum number of layers of the skip table int _max_level; // The current level of the skip table int _skip_list_level; // The head node pointer of the jump list Node<K, V> *_header; // file operations std::ofstream_file_writer; std::ifstream_file_reader; // Skip the current number of elements in the list int _element_count;
Question 5: File operations, ofstream and ifstream
Member method
1. Constructor and destructor
Constructor
SkipList(int); // Constructor template<typename K, typename V> SkipList<K, V>::SkipList(int max_level) { this->_max_level = max_level; this->_skip_list_level = 0; this->_element_count = 0; // create header node and initialize key and value to null K k; V v; this->_header = new Node<K, V>(k, v, _max_level); };
Initialize the jump table, set information such as the maximum height and the number of current nodes, create a head node, the key value of the head node is empty, but the level is set to the maximum level (here because the virtual head node should exist in Each layer).
Destructor
~SkipList(); template<typename K, typename V> SkipList<K, V>::~SkipList() { if (_file_writer. is_open()) { _file_writer. close(); } if (_file_reader. is_open()) { _file_reader. close(); } delete_header; }
Perform some resource recovery operations, close file descriptors, and delete virtual head nodes.
2. get_random_value
Function implementation
template<typename K, typename V> int SkipList<K, V>::get_random_level(){ int k = 1; while (rand() % 2) { k++; } k = (k < _max_level) ? k : _max_level; return k; };
The rand() function in C++ will return an arbitrary integer from 0 to the maximum random number.
In this way, the probability that k ++ can be executed in each cycle is 1/2, then this function can be realized
- The probability of returning 1 is 1/2
- The probability of returning 2 is 1/4
- The probability of returning 3 is 1/8, and so on
3. create_node
Create a new node according to the incoming k,v,level
// create new node template<typename K, typename V> Node<K, V>* SkipList<K, V>::create_node(const K k, const V v, int level) { Node<K, V> *n = new Node<K, V>(k, v, level); return n; }
4. Insert data insert_element
Return value
- 1 indicates that the current node already exists
- 0 indicates successful insertion
Process
- Initialize the update array, which is used to store the front node of the node to be inserted
- Starting from the highest level of the jump table, find the preceding node corresponding to the node to be inserted in each layer, and store it in the corresponding update[i]
- Returns 1 if there is a node equal to the value currently to be inserted
- Call get_random_level to get the level of the current node
- If random_level > current level, it means that the number of layers needs to be increased, and then the new node should be placed after the _header in the newly added number of layers, and the number of layers in the current skip table should be updated
- Create a new node and insert the current node after update[i] in each layer
template<typename K, typename V> int SkipList<K, V>::insert_element(const K key, const V value) { mtx.lock(); // Mutex locks are required for operations on shared resources Node<K, V> *current = this->_header; // Create an update array and initialize it, // The update array represents the nodes that need to be updated in each layer during this insertion process Node<K, V> *update[_max_level + 1]; memset(update, 0, sizeof(Node<K, V>*)*(_max_level + 1)); // Operate from the highest level in the jump list for(int i = _skip_list_level; i >= 0; i--) { while(current->forward[i] != NULL & amp; & amp; current->forward[i]->get_key() < key) { current = current->forward[i]; } update[i] = current; } // reached level 0 and forward pointer to right node, which is desired to insert key. // Reach level 0, the actual linked list level current = current->forward[0]; // If there is a node equal to the current value, it means that it cannot be inserted, and returns 1 if (current != NULL & amp; & amp; current->get_key() == key) { std::cout << "key: " << key << ", exists" << std::endl; mtx. unlock(); return 1; } // If current == NULL, it means we have reached the end of this layer // if current's key is not equal to key means we need to insert data between update[0] and current node if (current == NULL || current->get_key() != key ) { // Randomly obtain the current node, which levels should be inserted int random_level = get_random_level(); // If the randomly returned layer is greater than the current layer number of the jump table, then the number to be inserted should be placed after the _header in the layer larger than the current layer number if (random_level > _skip_list_level) { for (int i = _skip_list_level + 1; i < random_level + 1; i ++ ) { update[i] = _header; } _skip_list_level = random_level; } // Create a new node with key, value, random_level Node<K, V>* inserted_node = create_node(key, value, random_level); // Insert nodes at each level for (int i = 0; i <= random_level; i ++ ) { inserted_node->forward[i] = update[i]->forward[i]; update[i]->forward[i] = inserted_node; } std::cout << "Successfully inserted key:" << key << ", value:" << value << std::endl; _element_count++; } mtx. unlock(); return 0; }
5. Display data display_list
Print formatted information, each layer of information has
// Display skip list // traverse layer by layer, print output template<typename K, typename V> void SkipList<K, V>::display_list() { std::cout << "\ *****Skip List*****"<<"\ "; // loop through each layer for (int i = 0; i <= _skip_list_level; i ++ ) { Node<K, V> *node = this->_header->forward[i]; std::cout << "Level " << i << ": "; while (node != NULL) { std::cout << node->get_key() << ":" << node->get_value() << ";"; node = node->forward[i]; } std::cout << std::endl; } }
6. Find data search_element
Process: Relatively simple, because corresponding search operations are also performed when adding and deleting
Finally, it is necessary to judge whether the corresponding key is found
Question: Why is the and operator symbol used here
Answer: and, or, not are keywords in c++, but not c, and, or, not keywords in c++ are & amp; & amp;, || ,! The spare tire, this is because some machines did not support ios646 encoding
template<typename K, typename V> bool SkipList<K, V>::search_element(K key) { std::cout << "search_element-----------------" << std::endl; Node<K, V> *current = _header; // Start looking from the top level of the jump list for (int i = _skip_list_level; i >= 0; i--) { while (current->forward[i] & amp; & amp; current->forward[i]->get_key() < key) { current = current->forward[i]; } } // When reaching level 0, take out current->forward[0] current = current->forward[0]; // If it is not empty and the key is equal to the key we want, it is found if (current and current->get_key() == key) { std::cout << "Found key: " << key << ", value: " << current->get_value() << std::endl; return true; } std::cout << "Not Found Key:" << key << std::endl; return false; }
7. Delete data delete_element
Process
- Initialize the update array to store the front node of the node to be deleted
- Find the node to be deleted, if not found in the code, no operation will be performed by default
- Starting from level 0, if the post node of update[i] is equal to the node to be deleted, then the post node of update[i] points to the post node of the node to be deleted
- Delete the layer without nodes, that is, the current number of layers is reduced
// Delete element from skip list // delete from skip list template<typename K, typename V> void SkipList<K, V>::delete_element(K key) { mtx. lock(); Node<K, V> *current = this->_header; // Here we still need to find the predecessor node of the node to be deleted Node<K, V> *update[_max_level + 1]; memset(update, 0, sizeof(Node<K, V>*)*(_max_level + 1)); // start from highest level of skip list for (int i = _skip_list_level; i >= 0; i--) { while (current->forward[i] !=NULL & amp; & amp; current->forward[i]->get_key() < key) { current = current->forward[i]; } update[i] = current; } current = current->forward[0]; if (current != NULL & amp; & amp; current->get_key() == key) { // Starting from level 0, deleting the point to be deleted in each layer is actually changing the relationship of pointers for (int i = 0; i <= _skip_list_level; i ++ ) { // If starting from level i, the successor of update[i] is not the node to be deleted, just jump out if (update[i]->forward[i] != current) break; update[i]->forward[i] = current->forward[i]; } // Delete the layer without nodes, in fact, the current number of layers can be reduced // This can be compared with 0 because the memset function is all initialized to 0 during initialization while (_skip_list_level > 0 & amp; & amp; _header->forward[_skip_list_level] == 0) { _skip_list_level--; } std::cout << "Successfully deleted key "<< key << std::endl; _element_count --; } mtx. unlock(); return; }
8. Data dump_file
// Dump data in memory to file template<typename K, typename V> void SkipList<K, V>::dump_file() { std::cout << "dump_file-----------------" << std::endl; // open a file _file_writer.open(STORE_FILE); Node<K, V> *node = this->_header->forward[0]; while (node != NULL) { // Write to the file while printing on the console, only operate level 0 _file_writer << node->get_key() << ":" << node->get_value() << "\ "; std::cout << node->get_key() << ":" << node->get_value() << ";\ "; node = node->forward[0]; } _file_writer. flush(); _file_writer. close(); return; }
9. Load data load_file
// Load data from disk // load data from disk file template<typename K, typename V> void SkipList<K, V>::load_file() { _file_reader.open(STORE_FILE); std::cout << "load_file-----------------" << std::endl; std::string line; std::string* key = new std::string(); std::string* value = new std::string(); while (getline(_file_reader, line)) { // Parse each row of data, each row of data is a key-value pair // The line here is the incoming parameter, key and value are the outgoing parameters get_key_value_from_string(line, key, value); if (key->empty() || value->empty()) { continue; } // insert each data insert_element(*key, *value); std::cout << "key:" << *key << "value:" << *value << std::endl; } _file_reader. close(); }
10. Return data size size
// Get current SkipList size template<typename K, typename V> int SkipList<K, V>::size() { return_element_count; }
11. Data preprocessing operations
Separate key and value from string
template<typename K, typename V> void SkipList<K, V>::get_key_value_from_string(const std::string & str, std::string* key, std::string* value) { if(!is_valid_string(str)) { return; } *key = str.substr(0, str.find(delimiter)); *value = str.substr(str.find(delimiter) + 1, str.length()); }
Judging whether the string is legal
template<typename K, typename V> bool SkipList<K, V>::is_valid_string(const std::string & amp; str) { if (str.empty()) { return false; } if (str. find(delimiter) == std::string::npos) { return false; } return true; }
If the find() function
in string does not find a matching character, it will return std::string::nops
, std::string::npos
represents a constant of type size_type
, whose value is equal to the maximum value that type size_type
can represent.