The impact of manual vacuuming of partition tables with global partition indexes in KingbaseESV8R6

Background

There is such a case at the customer site. There is a large table with 500 partitions, and each partition has 200,000 records. There are updates very frequently and autovacuum is often triggered. Since the table is very large, autovacuum takes a long time. According to feedback from on-site colleagues, when manually vacuum partitioning, an error message was reported: “multiple backends attempting to wait for pincount 1”, which roughly means that the manual vacuum and autovacuum processes conflict. After querying the sys_stat_activity view, it turns out that the three default autovacuum processes are all busy. You can view the autovacuum_max_workers parameter. The default is 3, which indicates the number of processes that autovacuum is working on at the same time.

Faced with such a problem, you should first avoid the existence of this kind of database architecture according to the business type before application planning and deployment. For example, split the business, or increase parameter settings, storage performance, etc. in advance to avoid this risk. Currently, you can Increase the number of autovacuum_max_workers processes. However, increasing this parameter may cause excessive IO pressure. This requires appropriate adjustment of this parameter in conjunction with monitoring system resource usage. Since the customer’s consent is required, the current solution is to close the autovacuum process and then enable manual vacuum partition table. Next, we need to verify that we have a local partition index and a global partition index. Impact of manual vacuuming of partition tables.

Experimental environment, close the autovacuum process:

Test

Verify the partition table of the local partition index, and vacuum both partitions to see if there are locks waiting.

create table part_tab01(part_key char(1),state char(1),desc_content text)
partition by range(part_key)
(
  partition part_0 values less than(1),
  partition part_1 values less than(2)
);

insert into part_tab01 select '0','0',repeat('a',1000) from generate_series(1,500000);
insert into part_tab01 select '0','1',repeat('b',1000) from generate_series(1,10);
insert into part_tab01 select '1','1',repeat('a',1000) from generate_series(1,500000);
insert into part_tab01 select '1','0',repeat('b',1000) from generate_series(1,10);


create index idx_part_tab01_state on part_tab01(state);

select count(*) from part_tab01_part_0;
select count(*) from part_tab01_part_1;


alter system set autovacuum=off;
select sys_reload_conf();
show autovacuum;

update part_tab01_part_0 set desc_content='aaa';
update part_tab01_part_1 set desc_content='bbb';

Check partition expansion rate
  SELECT
     schemaname||'.'||relname,
     n_dead_tup,
     n_live_tup,
     round(n_dead_tup * 100 / (n_live_tup + n_dead_tup),2) AS dead_tup_ratio
 FROM
     pg_stat_all_tables
 WHERE relname='part_tab01_part_0'
 ORDER BY dead_tup_ratio DESC;
         ?column? | n_dead_tup | n_live_tup | dead_tup_ratio
-------------------------- + ------------ + ---------- -- +----------------
 public.part_tab01_part_0 | 100010 | 100010 | 50.00
(1 row)

  SELECT
     schemaname||'.'||relname,
     n_dead_tup,
     n_live_tup,
     round(n_dead_tup * 100 / (n_live_tup + n_dead_tup),2) AS dead_tup_ratio
 FROM
     pg_stat_all_tables
 WHERE relname='part_tab01_part_1'
 ORDER BY dead_tup_ratio DESC;
         ?column? | n_dead_tup | n_live_tup | dead_tup_ratio
-------------------------- + ------------ + ---------- -- +----------------
 public.part_tab01_part_1 | 100010 | 100010 | 50.00
(1 row)

Execute the following vacuum commands in two windows at the same time.
vacuum part_tab01_part_0;
vacuum part_tab01_part_1;
There is no lock waiting, and vacuum of two partitions can be completed at the same time

Verify the partition table of the global partition index and vacuum whether there are locks waiting for both partitions.

create table t1_part(id1 integer,id2 integer,id3 integer)
partition by range(id2)
(
  partition part01 values less than(10000),
  partition part02 values less than(20000),
  partition part03 values less than(30000),
  partition part04 values less than(40000),
  partition part05 values less than(50000),
  partition part06 values less than(60000),
  partition part07 values less than(70000),
  partition part08 values less than(80000),
  partition part09 values less than(90000),
  partition part10 values less than(maxvalue)
);
 
create unique index idx2_t1_part on t1_part(id2) global;
create index idx3_t1_part on t1_part(id2);

View index
TEST=# \di + idx2_t1_part
                                 List of relations
 Schema | Name | Type | Owner | Table | Size | Description
-------- + -------------- + -------------- + -------- + -- ------- + ------------ + -------------
 public | idx2_t1_part | global index | system | t1_part | 8192 bytes |
(1 row)

TEST=# \di + idx3_t1_part;
                                  List of relations
 Schema | Name | Type | Owner | Table | Size | Description
-------- + -------------- + ------------------- + ------ -- + --------- + --------- + -------------
 public | idx3_t1_part | partitioned index | system | t1_part | 0 bytes |
(1 row)


insert into t1_part select generate_series(1,100000),generate_series(1,100000),generate_series(1,100000);

update t1_part set id3='443';


Use the following statement to query the expansion rate of each partition, which has already exceeded the autovacuum threshold. Since there are many partitions, the result is skipped.
  SELECT
     schemaname||'.'||relname,
     n_dead_tup,
     n_live_tup,
     round(n_dead_tup * 100 / (n_live_tup + n_dead_tup),2) AS dead_tup_ratio
 FROM
     pg_stat_all_tables
 WHERE relname ='t1_part_part01'
 ORDER BY dead_tup_ratio DESC;
 
Execute the following vacuum commands in multiple windows at the same time
vacuum t1_part_part01
vacuum t1_part_part02
vacuum t1_part_part03
Sure enough, when executing subpartitions concurrently, the global partition index reports an error and concurrent vacuuming of subpartitions is not allowed. The reason is that when vacuuming a partition table, it also includes a vacuum index operation. Vacuum multiple partition tables at the same time, which is caused by vacuuming the same global partition index. wait.
TEST=# vacuum t1_part_part02
TEST-# ;
ERROR: multiple active vacuums for index "idx2_t1_part"


At this time, only vacuuming the primary partition table can succeed
TEST=# vacuum t1_part;
VACUUM


Check the expansion rate of each partition table separately and it has dropped to 0. Since there are many partitions, only two partitions are listed here.
TEST=# SELECT
TEST-# schemaname||'.'||relname,
TEST-# n_dead_tup,
TEST-# n_live_tup,
TEST-# round(n_dead_tup * 100 / (n_live_tup + n_dead_tup),2) AS dead_tup_ratio
TEST-# FROM
TEST-# pg_stat_all_tables
TEST-# WHERE relname ='t1_part_part05'
TEST-# ORDER BY dead_tup_ratio DESC;
       ?column? | n_dead_tup | n_live_tup | dead_tup_ratio
----------------------- + ------------ + ------------ + ----------------
 public.t1_part_part05 | 0 | 10000 | 0.00
(1 row)

TEST=# SELECT
TEST-# schemaname||'.'||relname,
TEST-# n_dead_tup,
TEST-# n_live_tup,
TEST-# round(n_dead_tup * 100 / (n_live_tup + n_dead_tup),2) AS dead_tup_ratio
TEST-# FROM
TEST-# pg_stat_all_tables
TEST-# WHERE relname ='t1_part_part06'
TEST-# ORDER BY dead_tup_ratio DESC;
       ?column? | n_dead_tup | n_live_tup | dead_tup_ratio
----------------------- + ------------ + ------------ + ----------------
 public.t1_part_part06 | 0 | 10000 | 0.00
(1 row)

Summary

  1. When vacuuming a table, you need to obtain a ShareUpdateExclusiveLock lock. However, two ShareUpdateExclusiveLock locks on the same table conflict and two processes are not allowed to vacuum the same table.
  2. For the partition table parent table vacuum, the actual partition table is vacuumed.
    For a partitioned table with a global partition index, when manually vacuating the table, first vacuum the main table, and then need to use the INDEX_CLEANUP feature, vacuum (verbose, INDEX_CLEANUP false) t1_part; is especially applicable to this case. When there is a global partition index, the table and When the index expansion rate is very high, it is recommended to skip the index when vacuating the table, and finally rebuild the index after vacuating the table.
    For this part, please refer to the blog garden document “KingbaseES V8R6 vacuum index_cleanup option” https://www.cnblogs.com/kingbase/p/16145397.html