Memory-based distributed NoSQL database Redis (5) Data storage and RDB design

Article directory

    • Knowledge point 18: Data storage design
    • Knowledge point 19: Redis persistence: RDB design
    • Knowledge point 20: Redis persistence: RDB testing
    • Postscript

Knowledge point 18: Data storage design

  • Goal: Master the design of common data stores

  • Implementation

    • Question
      • How do storage systems keep data safe from loss?
      • How does HDFS keep its data safe?
      • How does HDFS keep its metadata safe?
      • How does Spark keep RDD data safe?
    • Solution
      • Disk storage: data is stored on a hard drive
        • Features: large capacity, high durability, slower reads and writes than memory
        • Solution: replication (keep multiple copies)
      • Memory storage: data is stored in memory
        • Features: small capacity, low durability (lost on power-off or restart), fast reads and writes
        • Solution: replication, persistence to disk
      • How does HDFS keep its data safe?
        • Disk: replication mechanism (each block is stored as multiple replicas)
      • How does HDFS keep its metadata safe?
        • Disk: fsimage + edits
          • Replication: the fsimage can be configured to be stored in multiple directories, with one copy in each
        • Memory: metadata is loaded into memory at startup and is read and written there
          • edits: the operation log; the NameNode records every metadata change made in memory in the edits file
      • How does Spark keep RDD data safe?
        • Method 1: lineage: each RDD records its dependencies on its parent RDDs, so lost partitions can be recomputed
        • Method 2: persist/unpersist: cache the RDD in memory or on disk; storage levels can keep replicated copies
        • Method 3: checkpoint: persist the RDD data to reliable storage on disk [HDFS]
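The "full image plus operation log" pattern behind fsimage + edits can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class `MetaStore`, the file names `fsimage.json` and `edits.log`, and the JSON encoding are all hypothetical and are not how HDFS actually stores its metadata.

```python
import json
import os
import tempfile


class MetaStore:
    """Toy metadata store: a full image on disk plus an append-only edit log."""

    def __init__(self, directory):
        self.image_path = os.path.join(directory, "fsimage.json")  # hypothetical name
        self.edits_path = os.path.join(directory, "edits.log")     # hypothetical name
        self.data = {}
        self._recover()

    def _recover(self):
        # Load the last full image, then replay every edit logged after it.
        if os.path.exists(self.image_path):
            with open(self.image_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.edits_path):
            with open(self.edits_path) as f:
                for line in f:
                    op, key, value = json.loads(line)
                    self._apply(op, key, value)

    def _apply(self, op, key, value):
        if op == "set":
            self.data[key] = value
        elif op == "delete":
            self.data.pop(key, None)

    def _log(self, op, key, value=None):
        # Record the change on disk first, then apply it in memory.
        with open(self.edits_path, "a") as f:
            f.write(json.dumps([op, key, value]) + "\n")
        self._apply(op, key, value)

    def set(self, key, value):
        self._log("set", key, value)

    def delete(self, key):
        self._log("delete", key)

    def checkpoint(self):
        # Write a new full image, then truncate the edit log.
        with open(self.image_path, "w") as f:
            json.dump(self.data, f)
        open(self.edits_path, "w").close()


def demo():
    with tempfile.TemporaryDirectory() as d:
        store = MetaStore(d)
        store.set("/a", "file")
        store.checkpoint()
        store.set("/b", "dir")
        store.delete("/a")
        # Simulate a restart: a fresh object only sees what is on disk.
        return MetaStore(d).data


print(demo())  # {'/b': 'dir'}
```

After the simulated restart, the state is rebuilt from the image (`/a`) plus the replayed edits (add `/b`, delete `/a`), which is why only `/b` remains.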
  • Summary

    • Master the design of common data stores

Knowledge point 19: Redis persistence: RDB design

  • Goal: Master the RDB persistence mechanism of Redis

  • Path

    • step1: problem
    • step2: RDB solution
    • step3: Advantages and Disadvantages
  • Implementation

    • Question

      Redis keeps its data in memory and serves all reads and writes from memory. Once Redis restarts, the data in memory is lost. How does Redis achieve persistence?

      • Write: set/hset/lpush/sadd/zadd
        • Write to memory and return directly
      • Read: get/hget/lrange/smembers/zrange
        • Read memory directly
      • Idea: whenever Redis writes to memory, the data is also synchronized to disk
      • After a restart, the data on disk is reloaded into memory to serve reads
    • RDB Solution

      • Redis default persistence solution

      • Ideas

        • If the data in Redis memory receives a given number of updates within a given period of time, a full snapshot of all the data in Redis memory is taken and stored as a file on the hard disk.
        • Each new snapshot overwrites the old snapshot file. The snapshot is a full snapshot: it contains everything in memory as of the moment it was taken, so it is largely consistent with memory.
        • If Redis fails and restarts, it recovers the data from the snapshot file on the hard disk.
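The "new snapshot overwrites the old one, restart recovers from it" idea can be illustrated with a small Python sketch. It assumes a plain dict as the in-memory store and JSON as the file format; the names `dump_snapshot`, `load_snapshot` and `dump.json` are hypothetical, and real RDB files are binary. Writing to a temporary file and renaming it over the old snapshot is one common way to make the overwrite safe, and Redis is understood to follow a similar temp-file-then-rename pattern.

```python
import json
import os
import tempfile


def dump_snapshot(data, path):
    """Write a full snapshot to a temp file, then atomically replace the old one."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".snapshot")
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())    # make sure the bytes reach the disk
    os.replace(tmp_path, path)  # readers see either the old or the new file


def load_snapshot(path):
    """Restore memory from the snapshot; a missing file means an empty store."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dump.json")  # hypothetical file name
        dump_snapshot({"s1": "laoda"}, path)
        dump_snapshot({"s1": "laoda", "s2": "laoliu"}, path)  # overwrites the old snapshot
        return load_snapshot(path), sorted(os.listdir(d))


print(demo())  # ({'s1': 'laoda', 's2': 'laoliu'}, ['dump.json'])
```

Only one snapshot file remains after the second dump, and a crash in the middle of a dump would leave the previous snapshot untouched.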
      • Examples

        • Configuration: save 30 2
        • Explanation: if at least 2 updates [inserts, deletes, modifications] are made to the data in Redis memory within 30 seconds, all of the data in Redis memory is saved to a disk file as a snapshot.
      • Process

        [Figure: RDB snapshot process]

      • Trigger

        • Manual trigger: a snapshot is taken when certain commands are executed [generally not used]

          • save: manually triggers an RDB snapshot of all the data currently in memory
            • Runs in the foreground
            • Blocks all client requests until the snapshot is finished, then resumes processing them
            • Features: the snapshot is consistent with memory and no data is missed, but client requests are blocked
          • bgsave: manually triggers an RDB snapshot of all the data currently in memory
            • Runs in the background
            • The main process forks a child process that takes the snapshot, so clients can keep sending requests without being blocked
            • Features: client requests continue to execute, but data updated after the fork is not in the snapshot
          • shutdown: shutting down the server with this command takes a snapshot first (when save points are configured)
          • flushall: clears all data and then takes a snapshot of the empty dataset (when save points are configured), so the snapshot is meaningless
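The difference between save and bgsave comes down to who writes the file. The following Python sketch imitates the bgsave approach with `os.fork` (POSIX only): the child dumps the data as it looked at the moment of the fork, while the parent keeps accepting writes. The function name `background_save`, the dict store and the JSON output are assumptions made for illustration, not Redis internals.

```python
import json
import os
import tempfile


def background_save(data, path):
    """Fork a child that dumps `data` as it looked at fork time; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy-on-write view of the parent's memory.
        try:
            with open(path, "w") as f:
                json.dump(data, f)
        finally:
            os._exit(0)  # leave immediately, never return into the parent's code path
    return pid


def demo():
    store = {"s1": "laoda"}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dump.json")  # hypothetical file name
        pid = background_save(store, path)
        store["s2"] = "laoliu"  # the parent keeps serving writes
        os.waitpid(pid, 0)      # wait only so the demo can read the file
        with open(path) as f:
            return json.load(f), store


snapshot, live = demo()
print(snapshot)  # {'s1': 'laoda'}
print(live)      # {'s1': 'laoda', 's2': 'laoliu'}
```

The write made after the fork is present in the live store but not in the snapshot, which matches the bgsave behavior described above.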
        • Automatic trigger: snapshots are taken based on the number of updates that occur within a given period of time

          • The configuration file contains settings that determine when snapshots are taken

            # Redis accepts multiple save rules. Three are set by default; they work together, and a snapshot is taken as soon as any one of them is met.
            save 900 1
            save 300 10
            save 60 10000

            • Why are there three rules by default?
              • Reason: write rates differ between scenarios. A single rule would either snapshot too rarely under heavy writes or leave sparse writes unsaved for too long, so data could be lost.
              • Different rules cover different write rates; together they ensure data is saved under a range of workloads.
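How the three rules work together can be shown with a simplified Python model. The helper names `parse_save_points` and `should_snapshot` are hypothetical, and the comparison used here (at least N seconds elapsed and at least M changes since the last save) is a simplification of what the server actually checks.

```python
def parse_save_points(config_text):
    """Extract (seconds, changes) pairs from `save <seconds> <changes>` lines."""
    points = []
    for line in config_text.splitlines():
        parts = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(parts) == 3 and parts[0] == "save":
            points.append((int(parts[1]), int(parts[2])))
    return points


def should_snapshot(save_points, seconds_since_last_save, changes_since_last_save):
    """True if any rule is met: enough time has passed and enough changes have piled up."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


DEFAULTS = parse_save_points("""
save 900 1
save 300 10
save 60 10000
""")

print(should_snapshot(DEFAULTS, 60, 10000))  # True: heavy writes, third rule fires
print(should_snapshot(DEFAULTS, 60, 50))     # False: moderate writes, nothing fires yet
print(should_snapshot(DEFAULTS, 300, 50))    # True: the second rule fires at 300 s
print(should_snapshot(DEFAULTS, 900, 1))     # True: a single write is saved after 900 s
```

With only `save 60 10000`, the second and fourth cases would never be saved; with only `save 900 1`, a burst of writes could wait up to 15 minutes for a snapshot.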

    • Advantages and Disadvantages

      • Advantages
        • RDB takes a full snapshot: the data in the snapshot file matches the data in memory at the time the snapshot was taken.
        • Snapshots are compact binary files, so they are smaller and faster to generate and load.
        • Snapshots are written by a forked child process, so the main process keeps serving requests and performance is better.
        • Bottom line: faster, smaller, better performance
      • Disadvantages
        • Some data may be lost: updates made after the last snapshot are lost if Redis crashes before the next one
    • Application: scenarios that need high-performance reads and writes without affecting the business and can tolerate losing some data [caching]; large-scale data backup and recovery
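The data-loss risk named under the disadvantages can be made concrete with a small calculation. This is a hypothetical model, not a measurement: `lost_writes` simply counts the writes that happened after the last completed snapshot and before a crash.

```python
def lost_writes(write_times, snapshot_times, crash_time):
    """Count writes made after the last completed snapshot and up to the crash."""
    completed = [t for t in snapshot_times if t <= crash_time]
    last_snapshot = max(completed, default=float("-inf"))
    return sum(1 for t in write_times if last_snapshot < t <= crash_time)


# One write per second for 100 s, snapshots at t=30 and t=60, crash at t=75.
writes = list(range(1, 101))
print(lost_writes(writes, [30, 60], crash_time=75))  # 15
```

The 15 writes made between t=61 and t=75 exist only in memory, so they are gone after the restart. This is why RDB suits caching and backup scenarios that can tolerate losing the most recent updates.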

  • Summary

    • What is the RDB mechanism and what are its advantages and disadvantages?

      • Idea: if Redis receives a given number of updates within a given period of time, it takes a full snapshot as a binary file and stores it on disk.
        • After a restart, Redis loads the binary file directly and restores the data into memory.

      • Triggers
        • Manual: bgsave, shutdown
        • Automatic: save <seconds> <changes>
      • Features
        • Advantages: smaller, faster, full snapshot, better performance
        • Disadvantages: some data may be lost
      • Scenarios: large-scale data caching, or data backup and recovery

Knowledge point 20: Redis persistence: RDB testing

  • Goal: Test RDB persistence

  • Implementation

    • View current snapshot

      ll /export/server/redis/datas/
      


    • Configuration modification

      cd /export/server/redis
      vim redis.conf
      # line 202
      save 900 1
      save 300 10
      save 60 10000
      # rule added for this test: snapshot after at least 2 updates within 20 seconds
      save 20 2
      
    • Restart the Redis service so that the configuration takes effect

      # in redis-cli: stop the server
      shutdown
      # in the shell: start the server again
      redis-start.sh
      
    • Insert data

      set s1 "laoda"
      set s2 "laoliu"
      set s3 "laoliu"

    • View the dumped RDB snapshot

      ll /export/server/redis/datas/
    

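Besides listing the directory, the dumped file itself can be checked. An RDB file begins with the ASCII magic string `REDIS` followed by a four-digit format version, so reading the first nine bytes is enough for a quick sanity check. The sketch below is a minimal check under that assumption; the function name `rdb_version` is hypothetical, and the demo writes a hand-made header rather than a real snapshot.

```python
import os
import tempfile


def rdb_version(path):
    """Return the RDB format version from the file header, or None if it is not an RDB file."""
    with open(path, "rb") as f:
        header = f.read(9)
    # Expected layout: 5-byte magic "REDIS" + 4 ASCII digits for the version.
    if len(header) == 9 and header[:5] == b"REDIS" and header[5:].isdigit():
        return int(header[5:])
    return None


def demo():
    with tempfile.TemporaryDirectory() as d:
        fake = os.path.join(d, "dump.rdb")
        with open(fake, "wb") as f:
            f.write(b"REDIS0009" + b"\xff")  # hand-made header only, not a valid snapshot
        other = os.path.join(d, "notes.txt")
        with open(other, "wb") as f:
            f.write(b"hello")
        return rdb_version(fake), rdb_version(other)


print(demo())  # (9, None)
```

On the test machine this could be pointed at the snapshot under `/export/server/redis/datas/`; the exact file name depends on the `dbfilename` setting, which the steps above do not show.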

  • Summary

    • Test RDB persistence in practice

Postscript

Blog homepage: https://manor.blog.csdn.net

This article was originally written by Maynor and first appeared on the CSDN blog