[MySQL Indexing and Optimization] Database design practice (including ER model)

Database design practice (including ER model)

Article directory

  • Database design practice (including ER model)
    • 1. ER model
      • 1.1 Overview
      • 1.2 Modeling analysis
      • 1.3 Refinement of ER model
      • 1.4 Convert ER model diagram into data table
        • 1. Convert an entity into a database table
        • 2. Convert a many-to-many relationship into a data table
        • 3. Express a 1-to-many relationship through foreign keys
        • 4. Convert attributes into table fields
      • 1.5 Summary
    • 2. Design principles of database tables
    • 3. Suggestions on writing database objects
      • 3.1 About the library
      • 3.2 About tables and columns
      • 3.3 About index
      • 3.4 SQL writing

1. ER model

1.1 Overview

There are three elements in the ER model, namely entities, attributes and relationships

Entity can be regarded as a data object, which often corresponds to real individuals in real life. In the ER model, it is represented by a rectangle. Entities are divided into two categories, namely strong entities and weak entities. A strong entity refers to an entity that does not depend on other entities; a weak entity refers to an entity that has a strong dependence on another entity.

Attributes refers to the characteristics of the entity. For example, the supermarket’s address, contact number, number of employees, etc. Represented by ellipse in the ER model.

Relationship refers to the connection between entities. For example, when a supermarket sells goods to customers, it is a connection between the supermarket and the customer. Represented by diamond in the ER model. Relationships can be divided into 3 types, namely one-to-one, one-to-many, and many-to-many.

1.2 Modeling Analysis

The ER model seems cumbersome, but it is very important for us to control the overall project. If you are just developing a small application, perhaps simply designing a few tables is enough. Once you want to design an application of a certain scale, it is very critical to establish a complete ER model in the initial stage of the project. The essence of developing application projects is actually Modeling.

Here we take the design of e-commerce business as an example. Since the e-commerce business is too large and complex, we have simplified the business, such as targeting SKU (StockKeepingUnit) and SPU (Standard Product Unit). In the meaning of standardized product unit), we directly used SKU and did not mention the concept of SPU. This e-commerce business design has a total of 8 entities, as shown below.

  • address entity
  • user entity
  • shopping cart entity
  • Comment entity
  • Commodity entity
  • Product classification entity
  • order entity
  • Order details entity

Among them, User and Product Category are strong entities because they do not need to depend on any other entities. Others are weak entities, because although they can exist independently, they all depend on the user entity, so they are all weak entities. Knowing these elements, we can start from the strong entity User to create an ER model for the e-commerce business, as shown in the figure:

1.3 Refinement of ER model

With this ER model, we can understand the e-commerce business as a whole. The ER model just now shows the framework of the e-commerce business, but it only includes eight entities: order, address, user, shopping cart, comment, product, product category and order details, and the relationship between them, which cannot be mapped yet. Specific tables and the relationships between tables. We need to add the attribute and use ellipse to represent it, so that the ER model we get will be more complete.

Therefore, we need to further design each part of this ER model, that is, refine the specific business processes of e-commerce, and then integrate them together to form a complete ER model. This can help us clarify the design ideas of the database.

Next, let’s analyze what attributes each entity has, as shown below:

  1. Address entity includes user number, province, city, region, recipient, contact number, and whether it is the default address
  2. User entity includes user number, user name, nickname, user password, mobile phone number, email, avatar, and user level
  3. Shopping cart entity includes shopping cart number, user number, product number, product quantity, and image file URL
  4. Order entity includes order number, consignee, recipient phone number, total amount, user number, payment method, shipping address, order time
  5. Order details entity includes order details number, order number, product name, product number, product quantity
  6. Product entity includes product number, price, product name, category number, whether it is for sale, specifications, and color
  7. Comment entity includes comment id, comment content, comment time, user number, and product number
  8. Product classification entity includes category number, category name, parent category number

After this subdivision, we can redesign the e-commerce business. The ER model is as shown in the figure:

1.4 Convert ER model diagram into data table

By drawing the ER model, we have clarified the business logic. Now, we are about to take a very important step: convert the drawn ER model into a specific data table. The principles of conversion are introduced below:

  1. An entity is usually converted into a data table;
  2. A many-to-many relationship is usually converted into a data table;
  3. A 1 to 1 or 1 to many relationship is often expressed through the foreign key of the table instead of designing a new data table. ;
  4. Attributes are converted into table fields
1. Convert an entity into a database table

First look at the conversion of strong entities into data tables

User entity is converted into user table

CREATE TABLE
    user_info
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'number',
        user_name VARCHAR(200) COMMENT 'user name',
        nick_name VARCHAR(200) COMMENT 'User Nickname',
        password VARCHAR(200) COMMENT 'User password',
        phone_num VARCHAR(20) COMMENT 'Mobile phone number',
        email VARCHAR(200) COMMENT 'Email',
        head_img VARCHAR(200) COMMENT 'avatar',
        user_level VARCHAR(200) COMMENT 'user level',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='User table';

Product classification entity is converted into a product classification table (base_category). Since product classification can have first-level classification and second-level classification, for example, the first-level classification includes home furnishings, mobile phones, etc., and the second-level classification can be based on the mobile phone. The first-level classification is divided into mobile phone accessories, operators, etc. Here we plan the product classification entity into two tables, namely First-level classification table and Second-level classification table. The reason for this planning is that both the first-level classification and the second-level classification are limited, and the business structure is clearer when stored in two tables.

CREATE TABLE
    base_category1
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'number',
        name VARCHAR(100) NOT NULL COMMENT 'category name',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='First-level classification table';
CREATE TABLE
    base_category2
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'number',
        name VARCHAR(100) NOT NULL COMMENT 'Second-level classification name',
        category1_id bigint COMMENT 'First-level classification number',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='Second-level classification table';

Then convert weak entity into data table

Address entity is converted into an address table (user_address)

CREATE TABLE
    user_address
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'number',
        province VARCHAR(255) COMMENT 'province',
        city VARCHAR(255) COMMENT 'city',
        user_address VARCHAR(255) COMMENT 'Detailed address',
        user_id bigint UNSIGNED COMMENT 'userid',
        consignee VARCHAR(50) COMMENT 'recipient',
        telephone VARCHAR(20) COMMENT 'Contact information',
        default_mark TINYINT COMMENT 'Is it default (1: yes 0: no)',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci;

Order entity is converted into order table (order_info)

CREATE TABLE
    order_info
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'number',
        consignee VARCHAR(50) COMMENT 'recipient',
        consignee_tel VARCHAR(20) COMMENT 'Contact information',
        total_amount DECIMAL(10,2) COMMENT 'total amount',
        user_id bigint UNSIGNED COMMENT 'userid',
        payment_way VARCHAR(20) COMMENT 'payment method',
        delivery_address VARCHAR(255) COMMENT 'delivery address',
        create_time DATETIME COMMENT 'order time',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='Order Table';

Order details entity is converted into order details table (order_detail), as shown below. (Used to reflect many-to-many relationships, see the next section)

CREATE TABLE
    order_detail
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Order details number',
        order_id bigint UNSIGNED COMMENT 'order number',
        sku_id bigint COMMENT 'sku_id',
        sku_name VARCHAR(200) COMMENT 'sku name',
        sku_num INT COMMENT 'Purchase quantity',
        create_time DATETIME COMMENT 'operation time',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='Order Details';

Shopping cart entity is converted into a shopping cart table (cart_info)

CREATE TABLE
    cart_info
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'number',
        user_id bigint UNSIGNED NOT NULL COMMENT 'userid',
        sku_id bigint UNSIGNED COMMENT 'skuid',
        sku_num INT COMMENT 'Quantity',
        img_url VARCHAR(255) COMMENT 'Image link',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='Shopping cart table';

Comment entity is converted into a comment table (sku_comment)

CREATE TABLE
    sku_comment
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'number',
        user_id bigint UNSIGNED NOT NULL COMMENT 'user number',
        sku_id bigint UNSIGNED NOT NULL COMMENT 'item number',
        COMMENT VARCHAR(255) COMMENT 'Comment content',
        create_time DATETIME COMMENT 'Comment time',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='Product review form';

Commodity entity is converted into a product table (sku_info)

CREATE TABLE
    sku_info
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'item number',
        price DECIMAL(10,2) COMMENT 'price',
        sku_name VARCHAR(255) COMMENT 'product name',
        sku_desc VARCHAR(2000) COMMENT 'Product Description',
        category3_id bigint UNSIGNED COMMENT 'Third-level classification id (redundant)',
        color VARCHAR(50) COMMENT 'color',
        sale_mark TINYINT COMMENT 'Is it for sale (1: yes 0: no)',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci;
2. Convert a many-to-many relationship into a data table

There is one many-to-many relationship in this ER model, that is, the relationship between product and order. Products of the same category can appear in different orders. Different Orders can also contain items of the same type, so the relationship between them is many-to-many. For this situation, it is necessary to design an independent table to represent it. This kind of table is generally called intermediate table.

We can design an independent Order Details Table to represent the inclusion relationship between products and orders. This table is associated with 2 entities, namely orders and products. Therefore, the table must include the primary key of the table converted from these two entities. In addition, we also need to include the relationship’s own attributes, product quantity, product order price, and product name. That is, the order details table order_detail that has been created above, look at it again here

CREATE TABLE
    order_detail
    (
        id bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Order details number',
        order_id bigint UNSIGNED COMMENT 'order number',
        sku_id bigint COMMENT 'sku_id',
        sku_name VARCHAR(200) COMMENT 'sku name',
        sku_num INT COMMENT 'Purchase quantity',
        create_time DATETIME COMMENT 'operation time',
        PRIMARY KEY (id)
    )
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_general_ci COMMENT='Order Details';
3. Express one-to-many relationships through foreign keys

For example, in the product review table, user_id and sku_id are defined as foreign keys. Note that foreign key constraints are generally not set at the database level, because foreign keys will cause performance degradation and have a negative impact on performance. Instead, the application layer performs consistency check on the data to make the data satisfy the foreign key function.

4. Convert attributes into table fields

1.5 Summary

Any database-based application project can complete the database design work by creating an ER model first and then converting it into a data table. Creating an ER model is not the purpose. The purpose is to sort out the business logic and design an excellent database. Don’t model for the sake of modeling, use the process of creating an ER model to organize your thoughts so that creating an ER model makes sense.

2. Design principles of database tables

  1. The fewer the number of data tables, the better
  2. The fewer fields in the data table, the better
  3. The fewer the number of joint primary key fields in the data table, the better.
  4. The more primary and foreign keys you use, the better

Note: This principle is not absolute. Sometimes we need to sacrifice data redundancy in exchange for data processing efficiency.

3. Suggestions on writing database objects

3.1 About the library

  1. [Mandatory] The name of the library must be controlled within 32 characters. Only English letters, numbers and underscores can be used. It is recommended to start with an English letter.
  2. [Mandatory] Library names in Chinese and English must be all lowercase, and different words should be separated by underscore. You must see the name and know the meaning.
  3. [Mandatory] The name format of the library: business system name_subsystem name.
  4. [Mandatory] It is forbidden to use keywords (such as type, order, etc.) in the library name.
  5. [Mandatory] You must explicitly specify the character set when creating a database, and the character set can only be utf8 or utf8mb4.
    SQL example for creating a database: CREATE DATABASE crm_fund DEFAULT CHARACTER SET utf8’;
  6. [Recommendation] For programs to connect to database accounts, follow the principle of least privilege
    The database account can only be used under one DB and is not allowed to cross databases. In principle, the account used by the program is not allowed to have drop permissions.
  7. [Recommendation] The temporary library should be prefixed with tmp_ and suffixed with date;
    Backup repositories are prefixed by bak_ and suffixed by date.

3.2 About tables and columns

  1. [Mandatory] Table and column names must be controlled within 32 characters. Table names can only use English letters, numbers and underscores. It is recommended to start with English letters.
  2. [Mandatory] Table names and column names must be lowercase, and different words should be separated by underscores. You must see the name and know the meaning.
  3. [Mandatory] The table name must be strongly related to the module name. Table names in the same module should use a unified prefix as much as possible. For example: crm_fund_item
  4. [Mandatory] When creating a table, you must explicitly specify the character set as utf8 or utf8mb4.
  5. [Mandatory] Keywords (such as type, order, etc.) are prohibited in table names and column names.
  6. [Mandatory] The table storage engine type must be explicitly specified when creating a table. If there are no special requirements, it will always be InnoDB.
  7. [Mandatory] Comments are required when creating a table.
  8. [Mandatory] Field naming should use English words or abbreviations that express the actual meaning whenever possible. For example: company ID, do not use corporation_id, just use corp_id.
  9. [Mandatory] The field of Boolean value type is named is_description. For example, the field on the member table that indicates whether a member is enabled is named is_enabled.
  10. [Mandatory] It is forbidden to store large binary data such as pictures and files in the database. Usually the files are very large, causing the data volume to grow rapidly in a short period of time. When the database reads the database, a large amount of random data is usually performed.
    Machine IO operation, when the file is large, IO operation is very time-consuming. Usually stored on a file server, the database only stores file address information.
  11. [Suggestion] Regarding the primary key when creating a table: The table must have a primary key
    • It is mandatory that the primary key is id, the type is int or bigint, and it is auto_increment. It is recommended to use unsigned unsigned type.
    • The field that identifies the subject of each row in the table should not be set as the primary key. It is recommended to set it as other fields such as user_id, order_id, etc., and establish a unique key index. Because if it is set as the primary key and the primary key value is randomly inserted, it will cause innodb internal page splits and a large number of random I/O, resulting in performance degradation.
  12. [Recommendation] The core table (such as the user table) must have the creation time field (create_time) and the last update time field (update_time) of the row data to facilitate troubleshooting.
  13. [Recommendation] All fields in the table should have NOT NULL attributes as much as possible, and the business can define DEFAULT values as needed. Because using NULL values will cause problems such as each row occupying additional storage space, data migration is prone to errors, and aggregate function calculation results are biased.
  14. [Recommendation] All column names and column types that store the same data must be consistent (generally used as related columns, if the related column types are inconsistent during query, the data type will be automatically converted implicitly, which will cause the index on the column to become invalid, resulting in reduced query efficiency).
  15. [Recommendation] The intermediate table (or temporary table) is used to retain the intermediate result set, and the name starts with tmp_.
    The backup table is used to back up or capture a snapshot of the source table, and its name starts with bak_. Intermediate tables and backup tables are cleaned regularly.
  16. [Recommendation] When creating tables, you can use visualization tools. This ensures that all conventions related to tables and fields can be set. In fact, we usually rarely write DDL statements ourselves and can use some visual tools to create and operate databases and data tables. In addition to being convenient, visualization tools can also directly help us convert the structure definition of the database into SQL language to facilitate the export and import of database and data table structures.

3.3 About index

  1. [Mandatory] The primary key of the InnoDB table must be id int/bigint auto_increment, and the primary key value is prohibited from being updated.
  2. [Mandatory] For InnoDB and MyISAM storage engine tables, the index type must be BTREE.
  3. [Recommendation] The name of the primary key starts with pk_, the unique key starts with uni_ or uk_, and the ordinary index starts with idx_. Always use lowercase format, with the name or abbreviation of the field as the suffix.
  4. [Suggestion] For a columnname composed of multiple words, take the first letters of the first few words and add the last word to form column_name. For example: the index on sample table member_id: idx_sample_mid.
  5. [Recommendation] The number of indexes on a single table cannot exceed 6.
  6. [Suggestion] When building an index, consider building a joint index and put the field with the highest distinction first.
  7. [Recommendation] In the SQL of multi-table JOIN, ensure that there is an index on the connection column of the driven table, so that the JOIN execution efficiency is the highest.
  8. [Recommendation] When creating a table or adding an index, ensure that redundant indexes do not exist in the tables. For example: if key(a,b) already exists in the table, key(a) is a redundant index and needs to be deleted.

3.4 SQL writing

  1. [Mandatory] The terminal SELECT statement must specify a specific field name, and writing * is prohibited.
  2. [Suggestion] Specify the specific field name in the insert statement on the terminal. Do not write INSERT INTO t1 VALUES(…).
  3. [Recommendation] Except for static tables or small tables (within 100 rows), DML statements must have WHERE conditions and use index search.
  4. [Recommendation] INSERT INTO…VALUES(XX),(XX),(XX)… The value of XX here should not exceed 5000. Although the value is too high, it will go online quickly, but it will cause master-slave synchronization delay.
  5. [Recommendation] Do not use UNION in the SELECT statement. It is recommended to use UNION ALL, and the number of UNION clauses should be limited to 5.
  6. [Recommendation] In online environment, multi-table JOIN should not exceed 5 tables.
  7. [Recommendation] Reduce the use of ORDER BY, and communicate with the business without sorting without sorting, or put the sorting in the program. ORDER BY, GROUP BY, and DISTINCT statements are relatively CPU-intensive, and the CPU resources of the database are extremely valuable.
  8. [Recommendation] For query statements that include ORDER BY, GROUP BY, and DISTINCT, please keep the result set filtered by the WHERE condition within 1,000 rows, otherwise SQL will be very slow.
  9. [Recommendation] Multiple alter operations on a single table must be merged into one. Alter table on a large table with more than 1 million rows must be reviewed by the DBA and executed during the off-peak period of the business. Multiple alter operations need to be consolidated.
    combine together. Because alter table will generate table lock, all writes to the table will be blocked during the period, which may have a great impact on the business.
  10. [Recommendation] When operating data in batches, it is necessary to control the transaction processing interval and perform necessary sleep.
  11. [Recommendation] The transaction contains no more than 5 SQL statements. Because transactions that are too long will cause problems such as locking data for a long time, MySQL internal cache, excessive connection consumption, etc.
  12. [Recommendation] The update statement in the transaction should be based on the primary key or UNIQUE KEY as much as possible, such as UPDATE… WHERE id=XX; otherwise, gap locks will be generated and the lock range will be expanded internally, resulting in reduced system performance and deadlock.