6. PostgreSQL architecture

1. Postgres related processes

BgWriter: data file writing process
PgStat: Statistics process
SysLogger: System log process
checkpoint: checkpoint
AutoVacumm : Automatically cleans up debris
Archiver: archive process
WalWriter: redo log file writing process.

Shared memory area: similar to SGA;
shared_buffers
wal_buffers
clog_buffers
other buffers

Process private memory area: similar to PGA
temp_buffers
work_mem
maintenance_work_mem

data file
WAL file
Parameter files: postgresql.auto.conf, postgresql.conf, pg_hba.conf 

2. The sequence of requests after the client connects to issue SQL

1). The client makes a connection request
2).postgres generates a background process to serve client requests
3).postgres is the main daemon process of PG:
Guardian:
background writer
wal writer
checkpoint
auto vacuum
stat collector
archiver
4). The background process communicates through shared memory
shared memory: includes:
shared buffers
commit log
wal buffer
5). Background processes have their own local memory for calculation
Local Buffer includes:
temp buffer
maintenance_work_mem
work_mem
6). The reading of disk data is also through shared memory;
Disk:
datafiles
wal logs
arhives
config files
commit log files
pg_control file

3. PostgreSQL architecture model

PostgreSQL uses a client/server, process architecture model.
On the whole, it includes the following parts:
memory area
configuration file
service process
disk storage

4. PostgreSQL server

PostgreSQL server itself is composed of multiple processes
These processes are all maintained by the daemon process (soft link of postmaster) postgres
PG uses a traditional client and server architecture to allow users to access database content.
The client includes the 'psql' client that comes with PG
Or users can use the 'libpq' code library provided by PG to write their own client
There is a layer of shared memory (shared memory) between PG and disk, and most of the data will be placed in shared memory
, because the access speed is fast. The checkpointer process in the PG background is responsible for periodically saving data from shared memory to disk.

5. Shared memory area

shared_buffers
wal_buffers
clog_buffers
other buffers

(1) shared_buffers --- shared buffer
It represents the number of data blocks in the data buffer, and the size of each data block is 8KB.
The data buffer is located in the shared memory of the database, the bigger the better, it cannot be smaller than 128KB.
This parameter can only be set when starting the database.
The default value is 128MB.
Recommended value: 1/4 host physical memory

(2)wal_buffers --- the size of the log buffer
It can reduce IO. If you encounter more concurrent short transactions, you should use it together with commit_delay
The size of the memory space for storing WAL data, the system default value is 64K

6. Process private memory area

temp_buffers
work_mem
maintenance_work_mem

(1) temp_buffers --- temporary buffer
It is used to store the temporary table data accessed by the database session, and the system default value is 8M.
This parameter can be set in a separate session, especially when a large temporary table needs to be accessed, which will have a significant performance improvement

(2) work_mem --- working memory or operating memory.
It is responsible for the internal sort and hash operations, and the appropriate work_mem size can ensure that these operations are performed in memory.

(3) maintenance_work_mem --- maintenance work memory
Mainly for database maintenance operations or statements.
Mainly for VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY and other operations.
It is necessary to properly adjust this parameter when VACUUMing the entire database or rebuilding a large index
The postresql documentation suggests that this parameter cannot be configured too large when the autoacuum function is enabled

7. Main configuration file

Main configuration files: postgresql.auto.conf, postgresql.conf, pg_hba.conf

pg_hba.conf: Stores the permission configuration related to the connection, and the file is stored in the Data directory.
Each line in the file is a configuration, and each configuration includes several fields such as connection type, database, user, client address and authentication method
postgresql.conf: The configuration parameters of Pgsql are centrally managed in this file, and the file is stored in the Data directory.
The format of each parameter configuration item is "parameter name = parameter value"
postgresql.auto.conf has the same format as postgresql.conf, but does not allow manual editing.
This file holds the parameters set by the ALTER SYSTEM command.
The postgresql.auto.conf file is automatically read when postgresql.conf is read, and its settings take effect in the same way.
Settings override settings in postgresql.conf. 

8. Process communication – establish a session

The process of establishing a session
Phase 1: The client initiates a request
Phase 2: This phase is in charge of the main service process (soft link of postmaster) postgres
Whether the server accepts the client's host communication authentication
The server authenticates the client
Phase 3: The client establishes a communication connection with the postgres process, and the postgres process is responsible for all subsequent client request operations.
At the same time postgres as a background daemon process. 

9. Physical structure

Datafiles;
table
Toast (large field)
Index
sequence
Controlfile
Archived
WALs

10. Physical structure – data storage method

Directory structure
The data files of the default tablespace are stored in the base directory
Save the WAL log in pg_xlog, and its location can only be changed through soft links (named pg_wal after version 10)
Segments
Each table and index is stored in a separate data file
The file name is the file node (filenode) number of the table or index
If the table or index exceeds 1 GB it will be split into multiple segments
The first segment is named after the file node number, and the second and subsequent segments are named in the form of filenode.1, filenode.2

Page(Block) and Buffer
It is called page in disk and buffer in memory
The default is 8k, you can specify the block_size parameter to change the size at compile time
Relation: Represents a table or index
Tuple=row row in the table

11. How to view the directory structure

postgres=# show data_directory;
   data_directory
--------------------
 /postgresql/pgdata

select name, setting from pg_settings where category='File Locations';
postgres=# select name, setting from pg_settings where category='File Locations';
       name | setting
------------------- + ------------------------------ ------
 config_file | /postgresql/pgdata/postgresql.conf
 data_directory | /postgresql/pgdata
 external_pid_file |
 hba_file | /postgresql/pgdata/pg_hba.conf
 ident_file | /postgresql/pgdata/pg_ident.conf

12. Physical structure – directory meaning

Object Type Description
The base directory stores the database and data files
The global directory stores system tables at the cluster (instance) level
###pg_clog directory stores transaction commit status (pg12.2 does not have this directory)
The pg_commit_ts directory stores transaction commit time: pg12.2 is empty.
The pg_dynshmem directory stores related files of the dynamic shared memory subsystem: pg12.2 is empty.
pg_log directory Default database log storage directory: startup log.
The pg_logical directory stores the state of logical decoding
The pg_multixact directory stores the state of concurrent transactions (for shared row locks)
The pg_notify directory stores the status of listen/notify
The pg_replslot directory stores replication slot/bit data: space for a stand-alone machine.
The pg_serial directory stores information about serialized transactions: empty by default.
The pg_snapshots directory stores exported snapshots: empty by default.
The pg_stat directory stores persistent files for the statistics subsystem: empty by default
The pg_stat_tmp directory stores temporary files for the statistics subsystem
The pg_subtrans directory stores the state of subtransactions
The pg_tblspc directory stores soft connections pointing to the actual tablespace
The pg_twophase directory stores state about pre-committed transactions: empty by default.
The pg_xlog directory stores online transaction logs
####### File section.
PG_VERSION file The database version file to which the cluster belongs
postgresql.conf file database parameter configuration file
postgresql.auto.conf file database parameter configuration file
pg_hba.conf file access configuration file, focusing on access methods
pg_ident.conf file access configuration file, focusing on identity authentication and user mapping
postmaster.opts file The command-line options used when the cluster was last started
postmaster.pid file After the cluster starts, the generated lock file
##recovery.conf file Configuration file required in recovery or replication: pg12 does not have this. 

13. File directory tree

[pgsql@postgresql:/postgresql]$tree -L 3
.
|-- 1.txt
|-- archive
|-- backup
|-- pg12
| |-- bin
| | |-- clusterdb
| | |--createdb
| | |-- createuser
| | |-- dropdb
| | |-- dropuser
| | |-- ecpg
| | |-- initdb
| | |-- pg_archivecleanup
| | |-- pg_basebackup
| | |-- pgbench
| | |-- pg_checksums
| | |-- pg_config
| | |-- pg_controldata
| | |-- pg_ctl
| | |-- pg_dump
| | |-- pg_dumpall
| | |-- pg_isready
| | |-- pg_receivewal
| | |-- pg_recvlogical
| | |-- pg_resetwal
| | |-- pg_restore
| | |-- pg_rewind
| | |-- pg_test_fsync
| | |-- pg_test_timing
| | |-- pg_upgrade
| | |-- pg_waldump
| | |-- postgres
| | |-- postmaster -> postgres
| | |-- psql
| | |-- reindexdb
| | `--vacuumdb
| |-- include
| |--lib
| `-- share
| |-- errcodes.txt
| |-- extension
| |-- information_schema.sql
| |-- pg_hba.conf.sample
| |-- pg_ident.conf.sample
| |-- pg_service.conf.sample
| |-- postgres.bki
| |-- postgres.description
| |-- postgresql.conf.sample
| |-- postgres.shdescription
| |-- psqlrc.sample
| |-- snowball_create.sql
| |-- sql_features.txt
| |-- system_views.sql
| |-- timezone
| |-- timezonesets
| `-- tsearch_data
|-- pgdata
| |-- base
| | |-- 1
| | |-- 13592
| | |-- 13593
| | `-- 16385
| |-- current_logfiles
| |-- global
| | |-- 1136
| | |-- 1136_fsm
| | |-- 1136_vm
| | |-- 1137
| | |-- 1213
| | |-- 1213_fsm
| | |-- 1213_vm
| | |-- 1214
| | |-- 1214_fsm
| | |-- 1214_vm
| | |-- 1232
| | |-- 1233
| | |-- 1260
| | |-- 1260_fsm
| | |-- 1260_vm
| | |-- 1261
| | |-- 1261_fsm
| | |-- 1261_vm
| | |-- 1262
| | |-- 1262_fsm
| | |-- 1262_vm
| | |-- 2396
| | |-- 2396_fsm
| | |-- 2396_vm
| | |-- 2397
| | |-- 2671
| | |-- 2672
| | |-- 2676
| | |-- 2677
| | |-- 2694
| | |-- 2695
| | |-- 2697
| | |-- 2698
| | |-- 2846
| | |-- 2847
| | |-- 2964
| | |-- 2965
| | |-- 2966
| | |-- 2967
| | |-- 3592
| | |-- 3593
| | |-- 4060
| | |-- 4061
| | |-- 4175
| | |-- 4176
| | |-- 4177
| | |-- 4178
| | |-- 4179
| | |-- 4180
| | |-- 4181
| | |-- 4182
| | |-- 4183
| | |-- 4184
| | |-- 4185
| | |-- 4186
| | |-- 6000
| | |-- 6001
| | |-- 6002
| | |-- 6100
| | |-- 6114
| | |-- 6115
| | |-- pg_control
| | |-- pg_filenode.map
| | `-- pg_internal.init
| |-- log
| | |-- postgresql-2023-03-23_080736.csv
| | |-- postgresql-2023-03-23_080736.log
| | |-- postgresql-2023-03-23_081149.csv
| | `-- postgresql-2023-03-23_081149.log
| |-- pg_commit_ts
| |-- pg_dynshmem
| |-- pg_hba.conf
| |-- pg_ident.conf
| |-- pg_log
| | |-- postgresql-2022-12-22_053850.log
| | |-- postgresql-2022-12-22_054258.log
| | |-- postgresql-2022-12-22_054418.log
| | |-- postgresql-2022-12-22_054714.log
| | |-- postgresql-2022-12-22_054739.log
| | |-- postgresql-2022-12-22_055002.log
| | |-- postgresql-2022-12-22_055452.log
| | |-- postgresql-2022-12-22_055941.log
| | |-- postgresql-2022-12-22_060338.log
| | |-- postgresql-2022-12-22_060546.log
| | |-- postgresql-2022-12-22_060654.log
| | |-- postgresql-2023-02-17_150135.log
| | |-- postgresql-2023-03-03_053926.log
| | |-- postgresql-2023-03-03_054248.log
| | |-- postgresql-2023-03-03_091820.log
| | |-- postgresql-2023-03-04_011734.log
| | |-- postgresql-2023-03-23_072237.log
| | `-- postgresql-2023-03-23_073641.log
| |-- pg_logical
| | |-- mappings
| | |-- replorigin_checkpoint
| | `-- snapshots
| |-- pg_multixact
| | |-- members
| | `-- offsets
| |-- pg_notify
| | `-- 0000
| |-- pg_replslot
| |-- pg_serial
| |-- pg_snapshots
| |-- pg_stat
| |-- pg_stat_tmp
| | |-- db_0.stat
| | |-- db_13593.stat
| | `-- global.stat
| |-- pg_subtrans
| | `-- 0000
| |-- pg_tblspc
| |-- pg_twophase
| |-- PG_VERSION
| |-- pg_wal
| | |-- 000000010000000000000001
| | `-- archive_status
| |-- pg_xact
| | `-- 0000
| |-- postgresql.auto.conf
| |-- postgresql.conf
| |-- postgresql.conf.20221221
| |-- postmaster.opts
| `-- postmaster.pid

14. Background process

pgsql 2354 1 0 16:11 ? 00:00:00 /postgresql/pg12/bin/postgres
pgsql 2355 2354 0 16:11 ? 00:00:00 postgres: logger
pgsql 2357 2354 0 16:11 ? 00:00:00 postgres: checkpointer
pgsql 2358 2354 0 16:11 ? 00:00:00 postgres: background writer
pgsql 2359 2354 0 16:11 ? 00:00:00 postgres: walwriter
pgsql 2360 2354 0 16:11 ? 00:00:00 postgres: autovacuum launcher
pgsql 2361 2354 0 16:11 ? 00:00:00 postgres: archiver
pgsql 2362 2354 0 16:11 ? 00:00:00 postgres: stats collector
pgsql 2363 2354 0 16:11 ? 00:00:00 postgres: logical replication launcher
pgsql 2364 2354 0 16:12 ? 00:00:00 postgres: postgres postgres [local] idle

postgres is the soft connection of postmaster. 

15. The method of viewing the process tree.

[pgsql@postgresql:/home/pgsql]$pstree -p 2354
postgres(2354)─┬─postgres(2355)
               ├─postgres(2357)
               ├─postgres(2358)
               ├─postgres(2359)
               ├─postgres(2360)
               ├─postgres(2361)
               ├─postgres(2362)
               ├─postgres(2363)
               └─postgres(2364)

16. Summary

none