1. Postgres related processes
BgWriter: data file writing process PgStat: Statistics process SysLogger: System log process checkpoint: checkpoint AutoVacumm : Automatically cleans up debris Archiver: archive process WalWriter: redo log file writing process. Shared memory area: similar to SGA; shared_buffers wal_buffers clog_buffers other buffers Process private memory area: similar to PGA temp_buffers work_mem maintenance_work_mem data file WAL file Parameter files: postgresql.auto.conf, postgresql.conf, pg_hba.conf
2. The sequence of requests after the client connects to issue SQL
1). The client makes a connection request 2).postgres generates a background process to serve client requests 3).postgres is the main daemon process of PG: Guardian: background writer wal writer checkpoint auto vacuum stat collector archiver 4). The background process communicates through shared memory shared memory: includes: shared buffers commit log wal buffer 5). Background processes have their own local memory for calculation Local Buffer includes: temp buffer maintenance_work_mem work_mem 6). The reading of disk data is also through shared memory; Disk: datafiles wal logs arhives config files commit log files pg_control file
3. PostgreSQL architecture model
PostgreSQL uses a client/server, process architecture model. On the whole, it includes the following parts: memory area configuration file service process disk storage
4. PostgreSQL server
PostgreSQL server itself is composed of multiple processes These processes are all maintained by the daemon process (soft link of postmaster) postgres PG uses a traditional client and server architecture to allow users to access database content. The client includes the 'psql' client that comes with PG Or users can use the 'libpq' code library provided by PG to write their own client There is a layer of shared memory (shared memory) between PG and disk, and most of the data will be placed in shared memory , because the access speed is fast. The checkpointer process in the PG background is responsible for periodically saving data from shared memory to disk.
5. Shared memory area
shared_buffers wal_buffers clog_buffers other buffers (1) shared_buffers --- shared buffer It represents the number of data blocks in the data buffer, and the size of each data block is 8KB. The data buffer is located in the shared memory of the database, the bigger the better, it cannot be smaller than 128KB. This parameter can only be set when starting the database. The default value is 128MB. Recommended value: 1/4 host physical memory (2)wal_buffers --- the size of the log buffer It can reduce IO. If you encounter more concurrent short transactions, you should use it together with commit_delay The size of the memory space for storing WAL data, the system default value is 64K
6. Process private memory area
temp_buffers work_mem maintenance_work_mem (1) temp_buffers --- temporary buffer It is used to store the temporary table data accessed by the database session, and the system default value is 8M. This parameter can be set in a separate session, especially when a large temporary table needs to be accessed, which will have a significant performance improvement (2) work_mem --- working memory or operating memory. It is responsible for the internal sort and hash operations, and the appropriate work_mem size can ensure that these operations are performed in memory. (3) maintenance_work_mem --- maintenance work memory Mainly for database maintenance operations or statements. Mainly for VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY and other operations. It is necessary to properly adjust this parameter when VACUUMing the entire database or rebuilding a large index The postresql documentation suggests that this parameter cannot be configured too large when the autoacuum function is enabled
7. Main configuration file
Main configuration files: postgresql.auto.conf, postgresql.conf, pg_hba.conf pg_hba.conf: Stores the permission configuration related to the connection, and the file is stored in the Data directory. Each line in the file is a configuration, and each configuration includes several fields such as connection type, database, user, client address and authentication method postgresql.conf: The configuration parameters of Pgsql are centrally managed in this file, and the file is stored in the Data directory. The format of each parameter configuration item is "parameter name = parameter value" postgresql.auto.conf has the same format as postgresql.conf, but does not allow manual editing. This file holds the parameters set by the ALTER SYSTEM command. The postgresql.auto.conf file is automatically read when postgresql.conf is read, and its settings take effect in the same way. Settings override settings in postgresql.conf.
8. Process communication – establish a session
The process of establishing a session Phase 1: The client initiates a request Phase 2: This phase is in charge of the main service process (soft link of postmaster) postgres Whether the server accepts the client's host communication authentication The server authenticates the client Phase 3: The client establishes a communication connection with the postgres process, and the postgres process is responsible for all subsequent client request operations. At the same time postgres as a background daemon process.
9. Physical structure
Datafiles; table Toast (large field) Index sequence Controlfile Archived WALs
10. Physical structure – data storage method
Directory structure The data files of the default tablespace are stored in the base directory Save the WAL log in pg_xlog, and its location can only be changed through soft links (named pg_wal after version 10) Segments Each table and index is stored in a separate data file The file name is the file node (filenode) number of the table or index If the table or index exceeds 1 GB it will be split into multiple segments The first segment is named after the file node number, and the second and subsequent segments are named in the form of filenode.1, filenode.2 Page(Block) and Buffer It is called page in disk and buffer in memory The default is 8k, you can specify the block_size parameter to change the size at compile time Relation: Represents a table or index Tuple=row row in the table
11. How to view the directory structure
postgres=# show data_directory; data_directory -------------------- /postgresql/pgdata select name, setting from pg_settings where category='File Locations'; postgres=# select name, setting from pg_settings where category='File Locations'; name | setting ------------------- + ------------------------------ ------ config_file | /postgresql/pgdata/postgresql.conf data_directory | /postgresql/pgdata external_pid_file | hba_file | /postgresql/pgdata/pg_hba.conf ident_file | /postgresql/pgdata/pg_ident.conf
12. Physical structure – directory meaning
Object Type Description The base directory stores the database and data files The global directory stores system tables at the cluster (instance) level ###pg_clog directory stores transaction commit status (pg12.2 does not have this directory) The pg_commit_ts directory stores transaction commit time: pg12.2 is empty. The pg_dynshmem directory stores related files of the dynamic shared memory subsystem: pg12.2 is empty. pg_log directory Default database log storage directory: startup log. The pg_logical directory stores the state of logical decoding The pg_multixact directory stores the state of concurrent transactions (for shared row locks) The pg_notify directory stores the status of listen/notify The pg_replslot directory stores replication slot/bit data: space for a stand-alone machine. The pg_serial directory stores information about serialized transactions: empty by default. The pg_snapshots directory stores exported snapshots: empty by default. The pg_stat directory stores persistent files for the statistics subsystem: empty by default The pg_stat_tmp directory stores temporary files for the statistics subsystem The pg_subtrans directory stores the state of subtransactions The pg_tblspc directory stores soft connections pointing to the actual tablespace The pg_twophase directory stores state about pre-committed transactions: empty by default. The pg_xlog directory stores online transaction logs ####### File section. PG_VERSION file The database version file to which the cluster belongs postgresql.conf file database parameter configuration file postgresql.auto.conf file database parameter configuration file pg_hba.conf file access configuration file, focusing on access methods pg_ident.conf file access configuration file, focusing on identity authentication and user mapping postmaster.opts file The command-line options used when the cluster was last started postmaster.pid file After the cluster starts, the generated lock file ##recovery.conf file Configuration file required in recovery or replication: pg12 does not have this.
13. File directory tree
[pgsql@postgresql:/postgresql]$tree -L 3 . |-- 1.txt |-- archive |-- backup |-- pg12 | |-- bin | | |-- clusterdb | | |--createdb | | |-- createuser | | |-- dropdb | | |-- dropuser | | |-- ecpg | | |-- initdb | | |-- pg_archivecleanup | | |-- pg_basebackup | | |-- pgbench | | |-- pg_checksums | | |-- pg_config | | |-- pg_controldata | | |-- pg_ctl | | |-- pg_dump | | |-- pg_dumpall | | |-- pg_isready | | |-- pg_receivewal | | |-- pg_recvlogical | | |-- pg_resetwal | | |-- pg_restore | | |-- pg_rewind | | |-- pg_test_fsync | | |-- pg_test_timing | | |-- pg_upgrade | | |-- pg_waldump | | |-- postgres | | |-- postmaster -> postgres | | |-- psql | | |-- reindexdb | | `--vacuumdb | |-- include | |--lib | `-- share | |-- errcodes.txt | |-- extension | |-- information_schema.sql | |-- pg_hba.conf.sample | |-- pg_ident.conf.sample | |-- pg_service.conf.sample | |-- postgres.bki | |-- postgres.description | |-- postgresql.conf.sample | |-- postgres.shdescription | |-- psqlrc.sample | |-- snowball_create.sql | |-- sql_features.txt | |-- system_views.sql | |-- timezone | |-- timezonesets | `-- tsearch_data |-- pgdata | |-- base | | |-- 1 | | |-- 13592 | | |-- 13593 | | `-- 16385 | |-- current_logfiles | |-- global | | |-- 1136 | | |-- 1136_fsm | | |-- 1136_vm | | |-- 1137 | | |-- 1213 | | |-- 1213_fsm | | |-- 1213_vm | | |-- 1214 | | |-- 1214_fsm | | |-- 1214_vm | | |-- 1232 | | |-- 1233 | | |-- 1260 | | |-- 1260_fsm | | |-- 1260_vm | | |-- 1261 | | |-- 1261_fsm | | |-- 1261_vm | | |-- 1262 | | |-- 1262_fsm | | |-- 1262_vm | | |-- 2396 | | |-- 2396_fsm | | |-- 2396_vm | | |-- 2397 | | |-- 2671 | | |-- 2672 | | |-- 2676 | | |-- 2677 | | |-- 2694 | | |-- 2695 | | |-- 2697 | | |-- 2698 | | |-- 2846 | | |-- 2847 | | |-- 2964 | | |-- 2965 | | |-- 2966 | | |-- 2967 | | |-- 3592 | | |-- 3593 | | |-- 4060 | | |-- 4061 | | |-- 4175 | | |-- 4176 | | |-- 4177 | | |-- 4178 | | |-- 4179 | | |-- 4180 | | |-- 4181 | | |-- 4182 | | |-- 4183 | | |-- 4184 | | |-- 4185 | | |-- 4186 | | |-- 6000 | | |-- 6001 | | |-- 6002 | | |-- 6100 | | |-- 6114 | | |-- 6115 | | |-- pg_control | | |-- pg_filenode.map | | `-- pg_internal.init | |-- log | | |-- postgresql-2023-03-23_080736.csv | | |-- postgresql-2023-03-23_080736.log | | |-- postgresql-2023-03-23_081149.csv | | `-- postgresql-2023-03-23_081149.log | |-- pg_commit_ts | |-- pg_dynshmem | |-- pg_hba.conf | |-- pg_ident.conf | |-- pg_log | | |-- postgresql-2022-12-22_053850.log | | |-- postgresql-2022-12-22_054258.log | | |-- postgresql-2022-12-22_054418.log | | |-- postgresql-2022-12-22_054714.log | | |-- postgresql-2022-12-22_054739.log | | |-- postgresql-2022-12-22_055002.log | | |-- postgresql-2022-12-22_055452.log | | |-- postgresql-2022-12-22_055941.log | | |-- postgresql-2022-12-22_060338.log | | |-- postgresql-2022-12-22_060546.log | | |-- postgresql-2022-12-22_060654.log | | |-- postgresql-2023-02-17_150135.log | | |-- postgresql-2023-03-03_053926.log | | |-- postgresql-2023-03-03_054248.log | | |-- postgresql-2023-03-03_091820.log | | |-- postgresql-2023-03-04_011734.log | | |-- postgresql-2023-03-23_072237.log | | `-- postgresql-2023-03-23_073641.log | |-- pg_logical | | |-- mappings | | |-- replorigin_checkpoint | | `-- snapshots | |-- pg_multixact | | |-- members | | `-- offsets | |-- pg_notify | | `-- 0000 | |-- pg_replslot | |-- pg_serial | |-- pg_snapshots | |-- pg_stat | |-- pg_stat_tmp | | |-- db_0.stat | | |-- db_13593.stat | | `-- global.stat | |-- pg_subtrans | | `-- 0000 | |-- pg_tblspc | |-- pg_twophase | |-- PG_VERSION | |-- pg_wal | | |-- 000000010000000000000001 | | `-- archive_status | |-- pg_xact | | `-- 0000 | |-- postgresql.auto.conf | |-- postgresql.conf | |-- postgresql.conf.20221221 | |-- postmaster.opts | `-- postmaster.pid
14. Background process
pgsql 2354 1 0 16:11 ? 00:00:00 /postgresql/pg12/bin/postgres pgsql 2355 2354 0 16:11 ? 00:00:00 postgres: logger pgsql 2357 2354 0 16:11 ? 00:00:00 postgres: checkpointer pgsql 2358 2354 0 16:11 ? 00:00:00 postgres: background writer pgsql 2359 2354 0 16:11 ? 00:00:00 postgres: walwriter pgsql 2360 2354 0 16:11 ? 00:00:00 postgres: autovacuum launcher pgsql 2361 2354 0 16:11 ? 00:00:00 postgres: archiver pgsql 2362 2354 0 16:11 ? 00:00:00 postgres: stats collector pgsql 2363 2354 0 16:11 ? 00:00:00 postgres: logical replication launcher pgsql 2364 2354 0 16:12 ? 00:00:00 postgres: postgres postgres [local] idle postgres is the soft connection of postmaster.
15. The method of viewing the process tree.
[pgsql@postgresql:/home/pgsql]$pstree -p 2354 postgres(2354)─┬─postgres(2355) ├─postgres(2357) ├─postgres(2358) ├─postgres(2359) ├─postgres(2360) ├─postgres(2361) ├─postgres(2362) ├─postgres(2363) └─postgres(2364)
16. Summary
none