postgresql|database|materialized view analysis to improve query performance

Foreword:

We generally think that the digital world is a virtual world, OK, but in fact some of our needs are exactly the same as those in the real world. For example, for databases, especially relational databases, we hope that the databases we use can be faster (query speed) , higher (higher performance limit), stronger (concurrency performance, writing ability and other attributes), just like the sports spirit of the Olympics: faster, higher, stronger, but unfortunately, wishes are just wishes. If you want to achieve this The desire requires more technology, ideas.

So, there is no doubt that a materialized view is an object in a relational database specifically targeted at query performance.

Wait, view? Materialized view?

Yes, it is a materialized view. I can only tell you a cruel fact (I will give you the conclusion first, and read on if you are not convinced). Although materialized views can greatly improve the query performance of the database, Materialized views have great limitations. It can also be simply understood that materialized views are a double-edged sword. If used properly, the query speed of the database system can be very fast. If used incorrectly, it may cause tragic consequences such as system crash.

Below, I will give a brief introduction to the advantages and disadvantages of materialized views, applicable scenarios, and how to use materialized views.

One,

What are materialized views?

###Note: The base table, that is, the basic table, the query results do not appear out of thin air. They are naturally the results obtained from one or N tables. One or N tables are also omitted and are called base tables. Showed

Materialized view is a relational database object between tables and views. It can be imagined as a result set generated by querying a base table, but this result set can be very complex and can be a multi-table joint query. The result set can be a simple single-table query result set.

Compared with ordinary views, the more important thing is that materialized views have corresponding physical files stored in the $PGDATA directory. That is to say, it is not an illusory virtual thing, but a real existence. Operable objects, which is why it is called materialized view.

So, many students will have questions: a normal view is also a result set based on a query of a base table. Why should we use a materialized view? Is there something wrong?

OK, ordinary views cannot add indexes, and we all know that indexes can speed up queries. That is, based on appropriate indexes, we can speed up query efficiency. Therefore, a well-designed materialized view query Will be much faster than a well-designed normal view

Compared with tables, materialized views are just a query result set, which naturally does not have the functions of insert and update. In other words, materialized views cannot change the data in them.

Of course, in Navicat, this materialized view is called a materialized view. For now, you just need to understand one thing, it doesn’t matter what it is called

Two,

Advantages, disadvantages and applicable scenarios of materialized views

OK, now let’s summarize the advantages and disadvantages of materialized views

Advantages:

  1. Improving performance: By pre-computing and storing results, you can avoid the need to perform complex join operations or aggregation operations every time you query, thus greatly improving query speed.
  2. Reduced disk space: Because materialized views only store part of the data, rather than having to store all data from all underlying tables, disk space requirements can be reduced.
  3. Support fast data refresh: Most database systems support fast refresh of materialized views, which can update the data in the materialized view in a short period of time to reflect changes in the underlying data in a timely manner.
  4. Materialized views can add indexes, and indexes can effectively improve query efficiency

Disadvantages:

  1. Requires additional storage space: Materialized views require additional disk space to store their result sets.
  2. Update latency: Because materialized views typically require periodic refreshes, changes to the underlying data may take some time to be reflected in the materialized view. Therefore, if the base table is frequently updated and there are high requirements for the accuracy of the materialized view, the update requirement will be an issue that must be considered. Therefore, it is not recommended to update the base table too much. Using materialized views
  3. Maintenance complexity: Because materialized views need to be refreshed regularly and, in some cases, perform complex calculations, they require more hardware resources to support them.
  4. Not suitable for high-concurrency environments: In high-concurrency environments, if multiple users access the materialized view at the same time, lock competition problems may occur, affecting performance. The main reason is that there will be lock problems when operations such as refreshing or synchronizing materialized views.

Due to these advantages and disadvantages, it can be concluded that materialized views are not omnipotent and may be a double-edged sword. Materialized views need to be used in a suitable environment.
OK, the applicable scenarios for materialized views are generally:

  1. The base table is not updated very frequently. This can be quantified in some time. For example, the updated entries are near the minute level.
  2. Compared to materialized views, the degree of concurrency is not too high, and simultaneous queries by multiple users will not affect the normal completion of updates to materialized views.
  3. The hardware conditions of the database are relatively high and can bear frequent updates of materialized views. The main reason is that the CPU and memory can meet the refresh task of materialized views
  4. To provide data to the outside world, for example, A provides data to B. If A prepares the materialized view according to regulations and determines the appropriate rules for refreshing the materialized view, it can provide it to B in the form of compliant data.
  5. Some complex queries are applied more frequently.

Three,

Creation of materialized views

CREATE MATERIALIZED view materialized view name as query statement with DATA

Note: with is followed by data or no data. No data means that this materialized view is not filled in and only the data structure is generated. The default is with data

The following takes pgbench_accounts, a table of pgbench, as an example to illustrate the creation and management of materialized views

Create a materialized view

CREATE MATERIALIZED view vvv as SELECT * FROM pgbench_accounts;

View materialized views:

OK, if the base table pgbench_accounts changes at this time, the materialized view vvv will not change accordingly, because the regulations must refresh (synchronize) the pgbench_accounts table

Modify the abalance value of the basic aid equal to 48 to 123456789. After the modification, query to confirm that it has been modified

UPDATE pgbench_accounts set abalance='123456789' WHERE aid='48'
SELECT * from pgbench_accounts where aid='48'

At this time, query the materialized view vvv, you can see that aid 48 has not changed:

Manually refresh the materialized view:

refresh MATERIALIZED VIEW vvv with data;

Another type of refresh does not affect the use of existing materialized views, that is, parallel refresh without locking. If the materialized view is relatively large:

REFRESH MATERIALIZED VIEW CONCURRENTLY vvv

But refreshing at this time will report an error:

REFRESH MATERIALIZED VIEW CONCURRENTLY vvv
> ERROR: cannot refresh materialized view "public.vvv" concurrently
HINT: Create a unique index with no WHERE clause on one or more columns of the materialized view.

At this time, you need to add a unique index to the materialized view. In this example, add it to aid. Note that the index is also added in parallel CONCURRENTLY:

CREATE UNIQUE INDEX CONCURRENTLY vvvv ON vvv(aid)

Query again and you can see that the materialized view is synchronized with the base table:

Four,

Automatic refresh of materialized views

Automatic updating of materialized views requires the installation of some special plug-ins such as Apache iceberg or manually creating a trigger function + trigger. In this case, it is a trigger function + trigger

Creation of trigger-function (the function to be executed after the trigger is triggered, here naturally refreshes the materialized view):

CREATE OR REPLACE FUNCTION update_my_view()
RETURNS TRIGGER AS $$
DECLARE
    BEGIN
        -- Update the materialized view here.
        REFRESH MATERIALIZED VIEW CONCURRENTLY vvv;
        RETURN NULL;
    END;
$$ LANGUAGE plpgsql;

Creation of trigger (this trigger is based on base table):

CREATE TRIGGER update_my_view_trigger
AFTER INSERT OR UPDATE OR DELETE ON pgbench_accounts
FOR EACH STATEMENT
EXECUTE PROCEDURE update_my_view();

OK, now it can be verified. First, the base table is updated, aid 48 is updated to 888888, and aid 47 is deleted:

UPDATE pgbench_accounts set abalance='888888' WHERE aid='48'
DELETE from pgbench_accounts where aid='47'

When querying the materialized view at this time, you can see that we can see the changes in the materialized view without executing the refresh command:

pgbench=# SELECT * from vvv where aid='47';
 aid | bid | abalance | filler
----- + ----- + ---------- + --------
(0 rows)

pgbench=# SELECT * from vvv where aid='48';
 aid | bid | abalance | filler
----- + ----- + ---------- + ---------------------------- -------------------------------------------------- ----------
  48 | 1 | 888888 |
(1 row)

OK, automatic refresh of materialized view is successful

Five,

Modification of materialized views

Changing a materialized view basically has the same syntax as changing a table, for example, changing the name of a materialized view. It should be noted here that if there is a trigger, the trigger function should also be changed at the same time, otherwise the trigger will report an error: < /strong>

ALTER MATERIALIZED VIEW IF EXISTS vvv
     RENAME TO vvvvvv
UPDATE pgbench_accounts set abalance='8888882' WHERE aid='48'
> ERROR: relation "vvv" does not exist
CONTEXT: SQL statement "REFRESH MATERIALIZED VIEW CONCURRENTLY vvv"
PL/pgSQL function update_my_view() line 5 at SQL statement

Other modifications will not be given one by one:

ALTER MATERIALIZED VIEW [ IF EXISTS ] name
     action [, ... ]
ALTER MATERIALIZED VIEW name
     DEPENDS ON EXTENSION extension_name
ALTER MATERIALIZED VIEW [ IF EXISTS ] name
     RENAME [COLUMN] column_name TO new_column_name
ALTER MATERIALIZED VIEW [ IF EXISTS ] name
     SET SCHEMA new_schema
ALTER MATERIALIZED VIEW ALL IN TABLESPACE name
     [ OWNED BY role_name [, ... ] ]
     SET TABLESPACE new_tablespace [NOWAIT]

The knowledge points of the article match the official knowledge files, and you can further learn related knowledge PostgreSQL skill treeQuery datapsql7076 people are learning the system

syntaxbug.com © 2021 All Rights Reserved.