Nginx (05)-http working mechanism, configuration instructions and built-in variables

The HTTP service is Nginx's original and most fundamental service. Understanding its working mechanism goes a long way toward understanding how nginx works as a whole.
The core module of the HTTP service is ngx_http_core_module.

Table of Contents

http working mechanism

Configuration structure

Working Mechanism

http common commands

http

server

listen

server_name

location

priority

Special usage of “/”

root/alias/index

root

alias

index

default_type

keepalive_requests

keepalive_timeout

send_timeout

client_max_body_size

built-in variables

http working mechanism

Configuration structure

The following is the three-layer structure of http configuration:

http {
    ...
    server {
        ...
        location / {
            ...
        }
    }
}

Configuration relationship: There is only one http, one http contains multiple servers, and one server contains multiple locations.

Working mechanism

When Nginx receives an http(s) request, the processing steps are as follows:
1. Select the server. Among the servers listening on the address and port that accepted the connection, nginx decides which one should handle the request based on the “Host” request header field. If its value does not match any server_name, or the request does not contain this header field at all, nginx routes the request to the default server for that port. By default this is the first server defined for the port; it can also be set explicitly with the default_server parameter of the listen directive (the default server is a property of the listening port, not of the server name).

server {
    listen 80;
    server_name example.com www.example.com;
    ...
    location /hello/ {
        ...
    }
}
server {
    listen 80 default_server;
    server_name example.net www.example.net;
    ...
    location /hello/ {
        ...
    }
}

In the above configuration, the default server is the second one; without the default_server parameter it would be the first one.
2. After the server is selected, choose a location according to the location matching rules.
3. Process the request according to the directives in the matched location.
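
A common way to make use of the default-server rule in step 1 is a catch-all server that rejects requests with an unknown or missing “Host” field. The following is only a sketch, meant to be placed inside the http block; the server names and response text are illustrative assumptions, not part of the configuration above.

```nginx
# Catch-all: receives every request on port 80 whose Host header
# matches no server_name below (or that has no Host header at all).
server {
    listen 80 default_server;
    server_name _;   # "_" is an arbitrary invalid name that never matches a real host
    return 444;      # nginx-specific code: close the connection without sending a response
}

# Regular virtual server (hypothetical names).
server {
    listen 80;
    server_name example.com www.example.com;

    location / {
        default_type text/plain;
        return 200 "served by example.com\n";
    }
}
```

With this layout, a request carrying “Host: example.com” reaches the second server, while a request sent to the bare IP address falls through to the catch-all.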

http common commands

http

http{ … } provides a configuration context in which HTTP service-related directives are specified.

server

server { … } provides a configuration context in which virtual server directives are specified.

listen

Format: listen address[:port] [default_server] [ssl] [http2 | quic] [proxy_protocol] [setfib=number] [fastopen=number] [backlog=number] [rcvbuf=size] [sndbuf=size] [accept_filter=filter] [deferred] [bind] [ipv6only=on|off] [reuseport] [so_keepalive=on|off|[keepidle]:[keepintvl]:[keepcnt]];
listen port [default_server]…;
listen unix:path …;
Default: listen *:80 | *:8000;
Parameter Description:
address[:port]: Sets the address and port for IP, or the path to the UNIX-domain socket, on which the server will accept requests. You can specify both address and port, only the address, or only the port. The address can also be a hostname. For example:

listen 127.0.0.1:8000;
#No port specified, default is 80
listen 127.0.0.1;
listen 80;
listen *:8000;
listen localhost:8000;

default_server: Will make the server the default server matching address[:port].
ssl: Specifies that all connections accepted on this port should work in SSL mode. This allows for a more compact configuration of servers that handle both HTTP and HTTPS requests.
http2: Configure the port to accept HTTP/2 connections. Normally, to achieve this, the ssl parameter should also be specified, but nginx can also be configured to accept HTTP/2 connections without ssl.
quic: Configure the port to accept quic connections.
proxy_protocol: Allows specifying that all connections accepted on this port should use the proxy protocol.
setfib=number: Set the related routing table FIB of the listening socket (SO_SETFIB option). This currently only works on FreeBSD.
fastopen=number : Enables the “TCP Fast Open” feature for the listening socket and limits the maximum length of the connection queue that has not yet completed the three-way handshake. Do not enable this feature unless the server can handle receiving the same SYN packet multiple times.
backlog=number: Set the backlog parameter in the listen() call, which limits the maximum length of the pending connection queue. By default, the backlog is set to -1 on FreeBSD, DragonFly BSD, and macOS, and to 511 on other platforms.
rcvbuf=size: Set the receive buffer size of the listening socket (SO_RCVBUF option).
sndbuf=size: Set the send buffer size of the listening socket (SO_SNDBUF option).
accept_filter=filter: Sets the name of the accept filter (SO_ACCEPTFILTER option) for the listening socket that filters incoming connections before passing them to accept(). This only works on FreeBSD and NetBSD 5.0+. Possible values are dataready and httpready.
deferred: Instructs nginx to use a deferred accept() on Linux (the TCP_DEFER_ACCEPT socket option).
bind: Instructs nginx to make a separate bind() call for the given address:port pair. This is useful because if there are several listen directives with the same port but different addresses, and one of them listens on all addresses for that port (*:port), nginx will bind() only to *:port. In this case the getsockname() system call is made to determine the address that accepted the connection. If the setfib, fastopen, backlog, rcvbuf, sndbuf, accept_filter, deferred, ipv6only, reuseport, or so_keepalive parameters are used, a separate bind() call is always made for the given address:port pair.
ipv6only=on|off: Determines (via the IPV6_V6ONLY socket option) whether an IPv6 socket listening on the wildcard address [::] will accept only IPv6 connections or both IPv6 and IPv4 connections. This parameter is on by default. It can only be set once at startup.
reuseport: Creates a separate listening socket for each worker process. Currently only available on Linux 3.9+, DragonFly BSD and FreeBSD 12+ (1.15.1).
so_keepalive=on|off|[keepidle]:[keepintvl]:[keepcnt]: Configures “TCP keepalive” behavior for the listening socket. If this parameter is omitted, the operating system’s settings are in effect for the socket (default). Some operating systems support setting TCP keepalive parameters on a per-socket basis using the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT socket options. On such systems (currently Linux 2.4+, NetBSD 5+, and FreeBSD 9.0-STABLE), they can be configured using the keepidle, keepintvl, and keepcnt parameters.
The listen parameter is relatively complex, and its role will be described in detail later on when related functions are covered.
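
To make the parameter list more concrete, here is a small sketch of one server that accepts plain HTTP and HTTPS on both IPv4 and IPv6. The server name, the certificate paths and the backlog value are assumptions chosen for illustration only.

```nginx
server {
    # Plain HTTP on all IPv4 and IPv6 addresses
    listen 80;
    listen [::]:80;

    # HTTPS on the same addresses; "ssl" switches these sockets to SSL mode.
    # backlog raises the pending-connection queue for this socket
    # (socket options like this may be given only once per address:port).
    listen 443 ssl backlog=1024;
    listen [::]:443 ssl;

    server_name example.com;

    # Hypothetical certificate locations
    ssl_certificate     /etc/nginx/certs/example.com.crt;
    ssl_certificate_key /etc/nginx/certs/example.com.key;
}
```

Because ipv6only is on by default, the [::] sockets accept only IPv6 connections, which is why the IPv4 and IPv6 listen lines are both present.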

server_name

Format: server_name name …;
Default: server_name "";
name can take one of three forms:
1. An exact server name. For example, “example.com” matches only “example.com”.
2. A wildcard name containing “*”, which stands for any string. The asterisk can appear only at the beginning (“*.”) or at the end (“.*”) of the name; any other position is invalid. For example, *.example.com matches every subdomain of example.com, such as www.example.com, but does not match “example.com” itself. The special form “.example.com” matches both “example.com” and *.example.com.
3. A PCRE regular expression, which must start with “~”. For example:

server_name ~^www\d+\.example\.net$;

Notes:
1. Don’t forget the leading “^” and trailing “$” anchors. They are not required syntactically, but they are required logically: without them the expression can match anywhere inside the name.
2. The “.” in a domain name should be escaped with a backslash.
3. “{” and “}” are special characters in the configuration syntax. A regular expression that contains them must be enclosed in quotation marks.
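
The three forms of name can be seen side by side in the sketch below. All domain names, the /sites path and the response texts are hypothetical. When several names match, nginx prefers an exact name, then the longest wildcard starting with an asterisk, then the longest wildcard ending with an asterisk, and finally the first matching regular expression in configuration order.

```nginx
# 1. Exact names
server {
    listen 80;
    server_name example.org www.example.org;
    return 200 "exact\n";
}

# 2. Wildcard names: asterisk at the beginning or at the end only
server {
    listen 80;
    server_name *.example.org mail.*;
    return 200 "wildcard\n";
}

# 3. Regular expression with a named capture; the captured part
#    becomes the variable $user and can be used by other directives.
server {
    listen 80;
    server_name ~^(?<user>.+)\.example\.net$;

    location / {
        root /sites/$user;   # hypothetical directory layout
    }
}
```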

location

Format: location [ = | ~ | ~* | ^~ ] uri { … }
Matches the request URI against the uri parameter and sets the configuration for the requests that match. The optional modifier selects the kind of match:
=: exact match; the request URI must be identical to uri
~: case-sensitive regular expression match
~*: case-insensitive regular expression match
^~: prefix match; if this is the longest matching prefix, regular expressions are not checked
(no modifier): prefix match; regular expressions are still checked afterwards
uri: a prefix string, or a regular expression when the modifier is ~ or ~*. Values captured by “()” in the regular expression can be referenced as $1 to $9 by the directives inside the location.

Priority

When there are multiple locations in a server, nginx selects one as follows:
1. =: highest priority. If an exact match is found, the search stops immediately.
2. Prefix locations (^~ or no modifier): all of them are checked and the longest matching prefix is remembered. If that longest prefix uses ^~, the search stops and that location is used.
3. Regular expressions (~ and ~*): checked in the order in which they appear in the configuration file. The first expression that matches is used and no further locations are checked.
4. If no regular expression matches, the longest matching prefix remembered in step 2 is used.
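
The priority rules can be tried out with a configuration like the following sketch, in which each location simply returns its own label. The paths and labels are illustrative assumptions.

```nginx
server {
    listen 80;
    default_type text/plain;

    location = /          { return 200 "A: exact match\n"; }
    location /            { return 200 "B: prefix /\n"; }
    location /documents/  { return 200 "C: prefix /documents/\n"; }
    location ^~ /images/  { return 200 "D: prefix /images/, regex skipped\n"; }
    location ~* \.(gif|jpg|jpeg)$ { return 200 "E: regex\n"; }
}

# Request URI                  Selected location
# /                            A  (exact match ends the search)
# /index.html                  B  (only prefix "/" matches, no regex matches)
# /documents/document.html     C  (longest prefix, no regex matches)
# /images/1.gif                D  (longest prefix uses ^~, so E is never checked)
# /documents/1.jpg             E  (prefix C is remembered, but the regex matches and wins)
```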

Special usage of “/”

server {
    listen 80 default_server;
    # If there is =, it means that the following "/" is a string
    location = / { # Location identifier: @1
        ...
    }
    location / { # Location identifier: @2
        ...
    }
    location /a/ { # Location identifier: @3
        ...
    }
    location /a/b/ { # Location identifier: @4
        ...
    }
}

1. @1 uses “=”, so the “/” that follows is matched as an exact string. In @2, @3 and @4 the modifier is omitted, which means a prefix match: the request URI must begin with the given string. Because every URI begins with “/”, @2 acts as the fallback for requests that nothing else matches.
2. Matching results in the example:
http://127.0.0.1: matches @1 (the request URI is “/”)
http://127.0.0.1/: matches @1
http://127.0.0.1/index.html: matches @2
http://127.0.0.1/a: matches @2
http://127.0.0.1/a/: matches @3
http://127.0.0.1/a/index.html: matches @3
http://127.0.0.1/a/b/index.html: matches @4 (the longest matching prefix wins)
http://127.0.0.1/d/a/b/index.html: matches @2 (a prefix must match from the start of the URI)

root/alias/index

root

Format: root path; Default: root html;

Sets the root directory for requests. The path value can contain variables, except $document_root and $realpath_root. The path to the file is constructed simply by appending the URI to the value of the root directive. If the URI has to be modified, use the alias directive instead.

#When requesting "/i/top.gif", the file /data/w3/i/top.gif will be sent.
location /i/ {
    root /data/w3;
}

alias

Format: alias path; Defines a replacement for the specified location. For example:

#When requesting "/i/top.gif", the file /data/w3/images/top.gif will be sent.
location /i/ {
    alias /data/w3/images/;
}

A few notes:
1. The path value can contain variables, except $document_root and $realpath_root.
2. If the location is defined by a regular expression, the expression should contain captures “()” and alias should refer to them as $1, $2, and so on. For example:

# With a regular expression location, $1 holds the part of the URI captured after "/users/"
location ~ ^/users/(.+\.(?:gif|jpe?g|png))$ {
    alias /data/w3/images/$1;
}

3. When the location matches the last part of the alias value, it is better to use the root directive instead. For example:

# The location matches the last part of the alias value
location /images/ {
    alias /data/w3/images/;
}
# Better: use the root directive instead
location /images/ {
    root /data/w3;
}

index

Format: index file …;

Defines the files that will be used as an index. File names can contain variables. Files are checked in the specified order. The last element of the list can be a file with an absolute path. For example:

index index.$geo.html index.0.html /index.html;

It is mainly used together with root or alias: when the request URI ends with “/” and therefore names a directory rather than a file, the index file is served.

This directive is implemented by ngx_http_index_module.
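
As a rough illustration of how index combines with root, consider the sketch below. The directory /data/w3 and the file names are assumptions made for the example.

```nginx
server {
    listen 80;
    root  /data/w3;                  # hypothetical document root
    index index.html index.htm;      # server-wide index files

    location /docs/ {
        # Overrides the server-level list for this location only.
        index readme.html index.html;
    }
}

# Request "/"       -> nginx looks for /data/w3/index.html, then /data/w3/index.htm
# Request "/docs/"  -> nginx looks for /data/w3/docs/readme.html, then /data/w3/docs/index.html
```

Note that using an index file causes an internal redirect, so the resulting request (for example “/docs/readme.html”) goes through location matching again and may be handled by a different location.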

default_type

Format: default_type mime-type;
Default: default_type text/plain;

Defines the default MIME type of a response. The MIME type tells the browser what kind of content it is receiving so that it can parse and display it correctly. The mapping of file name extensions to MIME types is usually taken from /etc/nginx/mime.types. Typical usage:

include /etc/nginx/mime.types;
default_type application/octet-stream;

keepalive_requests

Format: keepalive_requests number; Default: keepalive_requests 1000; (100 before version 1.19.10)
Sets the maximum number of requests that can be served through one keep-alive connection. After the maximum number of requests has been made, the connection is closed. Closing connections periodically is necessary to free per-connection memory allocations. Therefore, setting the maximum too high can result in excessive memory usage and is not recommended.

keepalive_timeout

Format: keepalive_timeout timeout [header_timeout]; Default: keepalive_timeout 75s;
The first parameter sets a timeout during which a keep-alive client connection stays open on the server side. A value of zero disables keep-alive client connections. The optional second parameter sets the value in the “Keep-Alive: timeout=time” response header field. The two parameters can differ.
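
A minimal sketch of the two keep-alive directives used together is shown below. The concrete values are assumptions picked for illustration, not recommendations.

```nginx
http {
    # Close a keep-alive connection after it has served 1000 requests.
    keepalive_requests 1000;

    # Keep idle client connections open for 65 seconds on the server side,
    # but advertise "Keep-Alive: timeout=60" to the client, so that the
    # client normally gives up the connection before the server does.
    keepalive_timeout 65s 60s;

    server {
        listen 80;

        location /no-keepalive/ {
            # Zero disables keep-alive connections for this location.
            keepalive_timeout 0;
        }
    }
}
```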

send_timeout

Format: send_timeout time; Default: send_timeout 60s;

Sets a timeout for transmitting a response to the client. The timeout applies only between two successive write operations, not to the transmission of the whole response. If the client does not receive anything within this time, the connection is closed.

client_max_body_size

Format: client_max_body_size size; Default: client_max_body_size 1m;

Sets the maximum size allowed for client request bodies. If the size in the request exceeds the configured value, a 413 (Request Entity Too Large) error is returned to the client. 0 will disable checking of client request body size.
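
Since client_max_body_size can be set per location, a typical arrangement keeps a small limit for the whole server and raises it only where uploads are expected. The sketch below assumes hypothetical /upload/ and /stream/ paths and arbitrary sizes.

```nginx
server {
    listen 80;

    # Server-wide limit: bodies larger than 1 MB are rejected with 413.
    client_max_body_size 1m;

    location /upload/ {
        # Allow larger request bodies for the upload endpoint only.
        client_max_body_size 50m;
    }

    location /stream/ {
        # 0 disables the body size check for this location.
        client_max_body_size 0;
    }
}
```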

Built-in variables

The built-in variables of the http module can be used directly in the configuration file.
$is_args: “?” if the request line has arguments, otherwise an empty string
$arg_name: the value of the argument name in the request line
$args: all arguments in the request line, same as $query_string
$binary_remote_addr: Client address in binary form. The length of the value is always 4 bytes for IPv4 addresses and 16 bytes for IPv6 addresses.
$body_bytes_sent: Number of bytes sent to the client, excluding response headers; this variable is compatible with the “%B” parameter of the mod_log_config Apache module
$bytes_sent: Number of bytes sent to the client
$connection: connection serial number
$connection_requests: The current number of requests made over the connection
$connection_time: connection time (seconds), resolution is milliseconds
$content_length: the “Content-Length” request header field
$content_type: the “Content-Type” request header field
$cookie_name: the value of the cookie name
$document_root: the value of the root or alias directive for the current request
$uri: The current URI in the request (without request parameters). The value of $uri may change during request processing, for example, when performing internal redirects or using index files.
$document_uri: Same as $uri
$host: In order of precedence: the host name from the request line, or the host name from the “Host” request header field, or the server name matching the request
$hostname: hostname
$http_name: Any request header field; the last part of the variable name is the field, the field name is converted to lowercase, and dashes are replaced with underscores
$https: “on” if the connection is running in SSL mode, otherwise an empty string
$limit_rate: Set this variable to limit the response rate; see limit_rate
$msec: current time in seconds with milliseconds resolution
$nginx_version: nginx version
$pid: PID of the worker process
$pipe: “p” if the request is pipelined, “.” otherwise
$proxy_protocol_addr: Client address from the PROXY protocol header. The PROXY protocol must be enabled beforehand by setting the proxy_protocol parameter in the listen directive.
$proxy_protocol_port: Client port from the PROXY protocol header. The PROXY protocol must be enabled beforehand by setting the proxy_protocol parameter in the listen directive.
$proxy_protocol_server_addr: Server address from the PROXY protocol header. The PROXY protocol must be enabled beforehand by setting the proxy_protocol parameter in the listen directive.
$proxy_protocol_server_port: Server port from the PROXY protocol header. The PROXY protocol must be enabled beforehand by setting the proxy_protocol parameter in the listen directive.
$realpath_root: The absolute pathname corresponding to the currently requested root or alias directive value, all symbolic links resolve to real paths
$remote_addr: client address
$remote_port: client port
$remote_user: Username provided by Basic Authentication
$request: the complete original request line
$request_body: request body. The value of the variable is made available in locations processed by the proxy_pass, fastcgi_pass, uwsgi_pass, and scgi_pass directives when the request body has been read into a memory buffer.
$request_body_file: Name of a temporary file with the request body. At the end of processing, the file needs to be removed. To always write the request body to a file, client_body_in_file_only needs to be enabled. When the name of a temporary file is passed in a proxied request or in a request to a FastCGI/uwsgi/SCGI server, passing the request body should be disabled with the proxy_pass_request_body off, fastcgi_pass_request_body off, uwsgi_pass_request_body off, or scgi_pass_request_body off directives, respectively.
$request_completion: “OK” if the request has been completed, otherwise an empty string
$request_filename: The file path of the current request. Based on root or alias directives and request URI
$request_id: Unique request identifier generated from 16 random bytes, hexadecimal
$request_length: request length (including request line, headers and request body)
$request_method: request method, usually “GET” or “POST”
$request_time: Request processing time (seconds), resolution is milliseconds (1.3.9, 1.2.6); time elapsed after reading the first byte from the client
$request_uri: Complete original request URI (with parameters)
$scheme: request scheme, “http” or “https”
$sent_http_name: Any response header field; the last part of the variable name is the field name, which is converted to lowercase and dashes are replaced with underscores
$sent_trailer_name: Any field sent at the end of the response; the last part of the variable name is the field name, which is converted to lowercase and dashes are replaced with underscores
$server_addr: The address of the server that accepts requests. Calculating the value of this variable usually requires a system call. To avoid system calls, the listen directive must specify the address and use the bind parameter.
$server_name: The name of the server that accepts the request
$server_port: The port of the server that accepts requests
$server_protocol: Request protocol, usually “HTTP/1.0”, “HTTP/1.1”, “HTTP/2.0” or “HTTP/3.0”
$status: response status
$tcpinfo_rtt, $tcpinfo_rttvar, $tcpinfo_snd_cwnd, $tcpinfo_rcv_space: Information about the client TCP connection; available on systems that support the TCP_INFO socket option
$time_iso8601: local time in ISO 8601 standard format
$time_local: local time in common log format
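
Two typical places where these variables are used are log formats and redirects. The sketch below assumes a hypothetical format name, log path and URL layout; log_format and access_log come from ngx_http_log_module rather than from the core module.

```nginx
http {
    # Custom access log format built from built-in variables.
    log_format timing '$remote_addr - $remote_user [$time_local] '
                      '"$request" $status $body_bytes_sent '
                      '$request_time "$http_user_agent" $request_id';

    server {
        listen 80;
        server_name example.com;

        access_log /var/log/nginx/timing.log timing;

        # Build a redirect target from the scheme and host of the request.
        location /old/ {
            return 301 $scheme://$host/new/;
        }

        # Echo a query argument: /hello?name=nginx responds with "hello nginx".
        location /hello {
            default_type text/plain;
            return 200 "hello $arg_name\n";
        }
    }
}
```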
