Client -> local caching DNS server -> root name server -> top-level domain server -> second-level domain server -> subdomain server -> IP address resolved from the host name (iterative or recursive)
Resolution works in one of two modes: iterative or recursive.
Iterative queries are more common toward upstream servers, because they tie up far fewer resources on the root and top-level servers than recursive queries do. In practice, the client usually queries its local DNS server recursively, and the local server then queries the upstream servers iteratively.
Recursive resolution process
The client sends a request to the local caching DNS server. If that server has no cached record for the name:
The local server forwards the request to a root name server.
The root name server forwards the request to the corresponding top-level domain server.
The top-level domain server forwards it to the second-level domain server.
The second-level domain server forwards it to the subdomain server, which resolves the host name to an IP address.
The result is passed back layer by layer to the client, which can then use the IP address to reach the target host.
Along the way, the name-to-address mapping is cached locally for the next lookup.
Iterative resolution process
The client sends a request to the local caching DNS server. If that server has no cached record for the name:
The local server sends a query to a root name server, which returns the address of the top-level domain server.
The local server sends a query to the top-level domain server, which returns the address of the second-level domain server.
The local server sends a query to the second-level domain server, which returns the address of the subdomain server.
The local server sends a query to the subdomain server, which resolves the IP address and returns it to the local server.
The local caching DNS server returns the result to the client and caches the mapping for the next lookup.
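The iterative steps above can be sketched as a small simulation. This is only an illustration, not a real resolver: the server names (root, tld-com, ns-example, ns-dept), the host name host.dept.example.com, and the address 192.0.2.10 are all made up for the example, and no network traffic is involved.

```python
# Hypothetical delegation data: each "server" maps a name suffix either to
# the next server to ask (a referral) or to a final IP address (an answer).
SERVERS = {
    "root": {"com": ("referral", "tld-com")},
    "tld-com": {"example.com": ("referral", "ns-example")},
    "ns-example": {"dept.example.com": ("referral", "ns-dept")},
    "ns-dept": {"host.dept.example.com": ("answer", "192.0.2.10")},
}

cache = {}  # the local caching server's record of earlier answers


def query(server, hostname):
    """Ask one server; return its record for the longest matching suffix."""
    labels = hostname.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SERVERS[server]:
            return SERVERS[server][suffix]
    raise LookupError(f"{server} knows nothing about {hostname}")


def resolve_iteratively(hostname):
    """Local caching server: follow each referral itself until it gets an answer."""
    if hostname in cache:
        return cache[hostname], ["cache"]
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, hostname)
        if kind == "answer":
            cache[hostname] = value  # remember the mapping for the next lookup
            return value, path
        server = value  # referral: ask the next server down the tree


print(resolve_iteratively("host.dept.example.com"))  # walks every level
print(resolve_iteratively("host.dept.example.com"))  # served from the cache
```

The key point is that every upstream server only hands back a referral; the local caching server does all the follow-up work itself, which is why this mode is light on the upstream servers.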
TCP three-way handshake
TCP uses a three-way handshake to establish the connection before HTTP communication begins, and a four-way wave to close it afterwards.
Both are covered in detail in the previous blog post.
Here is a brief review of the two processes.
First handshake: to open the connection, the client sends a SYN packet (seq=x) to the server and enters the SYN_SENT state, waiting for the server to confirm.
Second handshake: the server receives the SYN packet and must acknowledge the client's SYN (ack=x+1) while sending its own SYN (seq=y). This combined SYN+ACK packet puts the server in the SYN_RECV state.
Third handshake: the client receives the server's SYN+ACK packet and sends an ACK packet (ack=y+1) back. Once it has been sent, the client and the server enter the ESTABLISHED state (TCP connection successful) and the three-way handshake is complete.
Brief description: pc1 sends a SYN message to pc2, pc2 replies with a SYN+ACK message, pc1 answers with an ACK message, and both ends enter the ESTABLISHED state.
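To make the sequence and acknowledgment numbers concrete, here is a minimal sketch that lists the three segments for given initial sequence numbers. The function name and the tuple layout are assumptions made for this illustration; it models only the numbering described above and does not open a real socket.

```python
def three_way_handshake(x, y):
    """Return the three segments as (sender, flags, seq, ack) tuples.

    x and y are the initial sequence numbers picked by client and server.
    """
    syn = ("client", "SYN", x, None)           # client enters SYN_SENT
    syn_ack = ("server", "SYN+ACK", y, x + 1)  # server enters SYN_RECV
    # The SYN consumed one sequence number, so the client's next seq is x + 1.
    ack = ("client", "ACK", x + 1, y + 1)      # both ends reach ESTABLISHED
    return [syn, syn_ack, ack]


for sender, flags, seq, ack in three_way_handshake(x=100, y=300):
    print(f"{sender:6} {flags:7} seq={seq} ack={ack}")
```

Each acknowledgment number is simply the other side's sequence number plus one, which is how each end proves it received the other's SYN.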
TCP four-way wave
Brief description: pc1 sends a FIN message to pc2, and pc2 replies with an ACK message. Once pc2 has finished sending its remaining data, it sends its own FIN+ACK message. pc1 replies with an ACK message, enters the TIME_WAIT state, and closes the connection after waiting 2MSL (twice the maximum segment lifetime). pc2 closes the connection as soon as it receives that ACK.
For a detailed description, please refer to the previous blog post.
HTML
HTML stands for Hypertext Markup Language. It is a specification, a standard, that uses markup tags to mark the various parts of the web page to be displayed. A web page file is itself a plain text file; the tags added to it tell the browser how to display the content.
HTML files can be written with any text editor that can produce plain text (.txt) files; just change the file name suffix to ".html" or ".htm".
HTML basic tags
(1) HTML grammar rules
HTML tags generally come in pairs: a start tag and a matching end tag mark where an element begins and ends, and the content between them is what the tag describes. The start tag is written as <tag name>, and the end tag has one more "/", written as </tag name>.
(2) HTML file structure
The outermost layer of an HTML file is the <html> tag, indicating that the file is written in HTML. Inside it, side by side, are the header tag (<head>) and the content tag (<body>). The most basic HTML file structure is as follows:
<html>
<head>Description information of the content of the web page</head>
<body>The content displayed on the webpage</body>
</html>
●Tags commonly used in header tags:
| Tag | Description |
| --- | --- |
| <title> | Defines the title of the document |
| <base> | Defines the default address for the links on the page |
| <link> | Defines the relationship between a document and external resources |
| <meta> | Defines metadata about the HTML document |
| <script> | Defines a client-side script |
| <style> | Defines style information for the HTML document |
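●Tags commonly used in content tags:
| Tag | Description |
| --- | --- |
| <table> | Defines a table |
| <tr> | Defines a row in the table |
| <td> | Defines a cell (column) within a table row |
| <img> | Defines an image |
| <a> | Defines a hyperlink |
| | Defines a line |
| <br> | Defines a line break |
| <font> | Defines the font |
| | Defines the font size |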
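One way to see the nesting of header and content tags is to walk a small page with a parser. The sketch below uses Python's standard html.parser module; the sample page, the TagCollector class name, and the example.com link are assumptions made up for this illustration.

```python
from html.parser import HTMLParser

# A made-up page that uses a few of the header and content tags.
PAGE = """<html>
<head><title>Demo page</title></head>
<body>
<table><tr><td>cell</td></tr></table>
<a href="https://example.com/">a link</a><br>
</body>
</html>"""


class TagCollector(HTMLParser):
    """Record every start tag in order, plus the text inside <title>."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        self._in_title = tag == "title"

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


collector = TagCollector()
collector.feed(PAGE)
print(collector.tags)   # start tags in document order
print(collector.title)  # text found between <title> and </title>
```

The order of the collected tags mirrors the file structure: html first, then the head section with its title, then the body and everything displayed on the page.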
Static and dynamic pages
1. Static pages
In website design, web pages in pure HTML format are usually called "static web pages". Static web pages are standard HTML files with the extension .htm or .html. They are the foundation of website construction, and early websites were generally built from them. Static web pages can still show various moving effects, such as animated GIFs, Flash, and scrolling text, but these effects are purely visual and are a different concept from the dynamic web pages introduced below.
2. Dynamic pages
A dynamic web page is the counterpart of a static one in terms of how the page is produced. Once the HTML code of a static page has been written, its content and appearance do not change unless the page code is modified. A dynamic page is different: although the page code stays the same, the displayed content can change with the time, the environment, or the results of database operations. Dynamic page URLs typically end not in the suffixes common to static pages, such as .htm, .html, .shtml, and .xml, but in .aspx, .asp, .jsp, .php, .pl, .cgi, and so on, and they often contain the telltale "?" symbol. Dynamic web pages combine basic HTML syntax with high-level programming languages such as Java, PHP, and C#, database programming, and other technologies to manage a website's content and style efficiently, dynamically, and interactively. In this sense, any web page generated by combining HTML with a high-level programming language and database technology is a dynamic web page.
3. Dynamic web languages
Early dynamic web pages mainly used CGI (Common Gateway Interface) technology. Although CGI is mature and powerful, it is gradually being replaced by newer technologies because it is difficult to program, inefficient, and cumbersome to modify.
The dynamic web page programming languages in common use today are as follows:
●PHP: Hypertext Preprocessor, one of the most popular scripting languages on the Internet today. Its syntax borrows from C, Java, Perl, and other languages, and with very little programming knowledge you can use PHP to build a truly interactive Web site.
●JSP: Java Server Pages, a technology launched by Sun Microsystems in June 1999. It is a Web development technology based on Java Servlets and the Java platform as a whole.
●Python: an object-oriented, cross-platform, dynamically typed programming language. It was originally designed for writing automation scripts, and as new versions and language features have been added it is increasingly used for standalone, large-scale project development.
●Ruby: a simple, fast, object-oriented scripting language developed by Yukihiro Matsumoto of Japan in the 1990s and distributed under the GPL and the Ruby License. Its inspiration and features come from Perl, Smalltalk, Eiffel, Ada, and Lisp.
The difference between dynamic pages and static pages
Static page: a standard HTML file with the extension .htm or .html. It has no back-end database support, contains no application code, and offers no interaction, and its URL does not contain "?".
Dynamic page: supported by a back-end database and containing application code, so the page can be interactive and update automatically. The extension is usually .php, .jsp, .py, .rb, .pl, and so on, and the URL often contains the telltale "?".
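The contrast can be shown in a few lines of Python. This is only a sketch under assumed names: STATIC_PAGE, render_dynamic, and the name parameter in the query string are invented for the example, and a real site would put such a function behind a web server and usually a database.

```python
from datetime import datetime
from html import escape
from urllib.parse import parse_qs, urlsplit

# A static page: the same bytes are returned on every request.
STATIC_PAGE = "<html><body>Hello, visitor</body></html>"


def render_dynamic(url, now=None):
    """Build the page at request time from the query string and the clock."""
    query = parse_qs(urlsplit(url).query)    # the part after "?"
    name = query.get("name", ["visitor"])[0]
    now = now or datetime.now()
    return (
        "<html><body>"
        f"Hello, {escape(name)}. It is {now:%H:%M}."  # escape user input
        "</body></html>"
    )


print(STATIC_PAGE)
print(render_dynamic("http://www.test.com/a.php?name=Alice"))
```

The code of render_dynamic never changes, yet its output differs from request to request, which is exactly the property that separates a dynamic page from a static one.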
Overview of the HTTP protocol
HTTP uses a request/response model. The client sends a request to the server; the request contains the request method, the URL, and the protocol version, followed by a MIME-like message carrying request modifiers, client information, and possibly body content. The server responds with a status line containing the protocol version and a success or error code, followed by a MIME-like message carrying server information, entity meta-information, and possibly the entity body.
HTTP version (interview questions)
(1) HTTP/0.9: obsolete. It only accepts GET as a request method, does not specify a version number in the communication, and does not support request headers. Because this version has no POST method, the client cannot pass much information to the server.
(2) HTTP/1.0: plain-text transmission. By default the TCP connection is closed as soon as the HTTP response has been sent. Header information (such as the protocol version number and status code) is supported.
(3) HTTP/1.1: introduces persistent connections (long connections): the TCP connection is not closed by default and can be reused by multiple requests, which also works well with proxy servers. It also supports pipelining, in which the client sends several requests on the same TCP connection without waiting for each response.
(4) HTTP/2.0: binary transmission. It supports long connections and full multiplexing: within one TCP connection, both the client and the server can send multiple requests or responses at the same time, and they need not correspond one-to-one in order. Header information is compressed before it is sent (using the HPACK scheme). Server push is supported, allowing the server to send resources to the client without being asked.
(5) HTTP/3.0: based on UDP (it runs over the QUIC transport protocol).
HTTP Methods (Interview Questions)
HTTP supports several different request commands, called HTTP methods. Every HTTP request message contains a method that tells the server what action to perform: fetch a page, run a gateway program, delete a file, and so on. The most commonly used methods are GET, POST, and PUT.
| HTTP method | Description |
| --- | --- |
| GET | Sends a query to request a resource; reads or downloads it, like a database SELECT |
| PUT | Submits data to the server to modify an existing resource without adding new kinds of data, like a database UPDATE |
| DELETE | Deletes a resource, like a database DELETE |
| POST | Sends a request containing user-submitted data to insert or create a resource. It changes what data exists, like a database INSERT, and creates new content. Almost all submission operations currently use POST |
| HEAD | Requests only the headers of a page, to query its meta-information |
GET and POST comparison
●GET method: retrieves data from the specified server. GET requests can be cached, are saved in the browser's browsing history, and are limited in length (browsers and servers cap the URL length). GET is mainly used to fetch data. The query string is shown in the URL, which is not safe for sensitive data, such as http://www.test.com/a.php?Id=123
●POST method: submits data to the specified server for processing. POST requests are not cached by default, are not saved in the browser's browsing history, and are not subject to the URL length limit. The submitted data is not shown in the URL, which is safer, although the body is still sent unencrypted unless HTTPS is used.
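The difference in where the parameters travel can be seen without sending anything over the network. The sketch below only constructs two request objects with Python's standard urllib; it reuses the http://www.test.com/a.php example URL from above and never contacts that host.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"Id": "123"}
encoded = urlencode(params)  # "Id=123"

# GET: the parameters travel in the URL's query string.
get_req = Request("http://www.test.com/a.php?" + encoded)

# POST: the same parameters travel in the request body instead.
post_req = Request("http://www.test.com/a.php", data=encoded.encode("ascii"))

for req in (get_req, post_req):
    print(req.get_method(), req.full_url, req.data)
```

urllib picks the method from the presence of a body: a request with no data is a GET, and supplying data turns it into a POST.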
HTTP Status Code (Interview Question)
An HTTP status code is a 3-digit code that indicates the status of the web server's HTTP response. When the browser requests a URL, the server returns the status code that matches how the request was handled. Normal responses usually carry a 2xx or 3xx code (such as 200); when something goes wrong, a 4xx or 5xx code (such as 404) is returned.
| First digit of the status code | Defined range | Classification |
| --- | --- | --- |
| 1xx | 100-101 | Informational |
| 2xx | 200-206 | Success |
| 3xx | 300-305 | Redirect |
| 4xx | 400-415 | Client Error |
| 5xx | 500-505 | Server Error |
HTTP Common Status Codes (Interview Questions)
| Status Code | Function Description |
| --- | --- |
| 200 | OK: everything is normal |
| 301 | Permanent redirect |
| 302 | Temporary redirect |
| 401 | Unauthorized: wrong user name or password |
| 403 | Forbidden: access denied, client IP or host name blocked |
| 404 | Not Found: the requested file does not exist; the URL path requested by the client is wrong, or the server has no web page file at that path |
| 414 | The request URI is too long |
| 500 | Internal server error |
| 502 | Bad Gateway: the proxy server received an invalid or error response from the backend server, and the gateway replies to the client with 502 |
| 503 | Service unavailable: the service is currently unavailable because of server overload or maintenance (downtime) |
| 504 | Gateway Timeout: the proxy server did not receive a response from the backend server within the specified time (response timeout), and the gateway replies to the client with 504 |
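Because the class of a status code is determined by its first digit, it can be computed with an integer division. The sketch below pairs that with the standard reason phrases from Python's http.HTTPStatus enum; the CLASSES table and the describe function are names invented for this example.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
CLASSES = {1: "Informational", 2: "Success", 3: "Redirect",
           4: "Client Error", 5: "Server Error"}


def describe(code):
    """Return (class name, standard reason phrase) for a status code."""
    return CLASSES[code // 100], HTTPStatus(code).phrase


for code in (200, 301, 404, 502, 504):
    print(code, *describe(code))
```

Note that describe raises a ValueError for numbers that are not registered status codes, since HTTPStatus only knows the standard ones.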
HTTP request process analysis
When the user enters a URL in the browser, the browser sends an HTTP request message made up of a request line, request headers, and a request body. After receiving the request, the server returns a response message made up of a status line, response headers, and a response body.
HTTP protocol message format
1. Request message
Request line: consists of three parts: the request method, the URL, and the protocol version.
Request headers: key-value meta-information that adds extra information to the request message. Each header is a name/value pair on its own line, with the name and value separated by a colon.
Blank line: a blank line follows the last request header to mark the end of the headers; the request body comes after it. This line is essential and cannot be omitted.
Request body: the parameters submitted with the request. With GET, the parameters are already given in the URL, so the body carries no data. With POST, the submitted parameters are in the request body.
| Request header | Description |
| --- | --- |
| Host | The address of the server that receives the request; it can be IP:port or a domain name |
| User-Agent | The name of the application that sent the request |
| Connection | Connection-related options, such as Connection: Keep-Alive |
| Accept-Charset | Tells the server which character encodings the client can accept |
| Accept-Encoding | Tells the server which compression formats the client can accept |
| Accept-Language | Tells the server which languages the client can accept |
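The four parts of a request message can be assembled and taken apart again with plain string handling. This is a simplified sketch, not a full HTTP implementation: build_request and parse_request are names made up for the example, the header values are illustrative, and edge cases such as repeated headers are ignored.

```python
def build_request(method, path, headers, body=""):
    """Assemble request line, headers, blank line, and body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request(raw):
    """Split a raw request back into its parts."""
    head, _, body = raw.partition("\r\n\r\n")  # the blank line ends the headers
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, version, headers, body


raw = build_request(
    "POST", "/a.php",
    {"Host": "www.test.com", "User-Agent": "demo-client/0.1",
     "Connection": "Keep-Alive", "Accept-Language": "en",
     "Content-Length": "6"},
    body="Id=123",
)
print(raw)
print(parse_request(raw))
```

The parser relies entirely on the blank line to know where the headers stop, which is why that line cannot be left out.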
2. Response message
Status line: consists of three parts: the protocol version, the status code, and the status code description.
Response headers: key-value meta-information. Like request headers, they add extra information to the response message.
Blank line: a blank line follows the last response header to mark the end of the headers.
Response body: the data the server returns, such as the HTML of the page, which the browser parses and then displays.
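A response message can be split along the same lines. The sketch below parses a hand-written sample response; RAW_RESPONSE and parse_response are hypothetical names for this illustration, and real clients would normally leave this work to an HTTP library.

```python
BODY = "<html><body>Hi</body></html>"

# A made-up response: status line, headers, blank line, body.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    f"Content-Length: {len(BODY)}\r\n"
    "\r\n" + BODY
)


def parse_response(raw):
    """Split a raw response into version, code, reason, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")  # the blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    # Split at most twice: the reason phrase may itself contain spaces.
    version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return version, int(code), reason, headers, body


print(parse_response(RAW_RESPONSE))
```

The status line is split only twice so that multi-word descriptions such as "Not Found" stay in one piece.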