Several questions about network protocols (3)

1. When a datagram cannot be delivered, an ICMP error message is sent to report the problem. But what if the ICMP error message itself runs into a problem?

Answer: An ICMP error message is never generated in response to any of the following:

  • An ICMP error message (an ICMP query message, however, may trigger an ICMP error message);
  • An IP datagram whose destination address is a broadcast or multicast address;
  • A datagram sent as a link-layer broadcast;
  • Any IP fragment other than the first;
  • A datagram whose source address does not identify a single host, that is, a source address that is the zero address, a loopback address, a broadcast address, or a multicast address.
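
The rules above can be read as a single yes/no check. The following is a minimal sketch of that check, not how any real stack implements it: the struct datagram_info and the function may_send_icmp_error are hypothetical names invented for illustration, and the caller is assumed to have classified the offending datagram already.

```c
#include <stdbool.h>

/* Hypothetical summary of the datagram that triggered the error.
 * A real stack derives these facts from the IP header and the link layer. */
struct datagram_info {
    bool is_icmp_error;                 /* the datagram is itself an ICMP error */
    bool dst_is_broadcast_or_multicast; /* IP destination is broadcast/multicast */
    bool link_layer_broadcast;          /* it arrived as a link-layer broadcast */
    bool is_first_fragment;             /* also true for unfragmented datagrams */
    bool src_is_single_host;            /* source is not zero/loopback/broadcast/multicast */
};

/* Returns true only if none of the five exclusion rules applies. */
static bool may_send_icmp_error(const struct datagram_info *d)
{
    if (d->is_icmp_error)
        return false;
    if (d->dst_is_broadcast_or_multicast)
        return false;
    if (d->link_layer_broadcast)
        return false;
    if (!d->is_first_fragment)
        return false;
    if (!d->src_is_single_host)
        return false;
    return true;
}
```

The first rule is the one that answers the question: since an ICMP error never triggers another ICMP error, errors cannot bounce back and forth forever.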

2. What network programming interface does ping use?

Answer: ping uses the socket interface, and the protocol it speaks is ICMP. The socket is created as follows:

socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)

SOCK_RAW opens a raw socket, which lets the program send and receive packets of a protocol carried directly over IP, in this case ICMP.

For TCP, the socket is created like this:

socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)

For UDP, the socket is created like this:

socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)
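
What ping writes into the raw ICMP socket is an echo request: type 8, code 0, a checksum, an identifier, and a sequence number, followed by optional payload. The sketch below builds such a message in a buffer so it can be inspected without root privileges. The function names inet_checksum and build_echo_request are illustrative, not part of any library; the checksum is the standard Internet checksum (RFC 1071).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum (RFC 1071) over len bytes, treating the data as
 * big-endian 16-bit words; an odd trailing byte is padded with zero. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)
        sum += (uint32_t)(data[i] << 8);
    while (sum >> 16)                      /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Writes an ICMP echo request into buf and returns its length,
 * or 0 if the buffer is too small. */
static size_t build_echo_request(uint8_t *buf, size_t cap,
                                 uint16_t id, uint16_t seq,
                                 const uint8_t *payload, size_t payload_len)
{
    size_t total = 8 + payload_len;        /* 8-byte ICMP header */
    uint16_t csum;

    if (cap < total)
        return 0;
    buf[0] = 8;                            /* type: echo request */
    buf[1] = 0;                            /* code */
    buf[2] = 0;                            /* checksum, zero while computing */
    buf[3] = 0;
    buf[4] = (uint8_t)(id >> 8);
    buf[5] = (uint8_t)(id & 0xFF);
    buf[6] = (uint8_t)(seq >> 8);
    buf[7] = (uint8_t)(seq & 0xFF);
    if (payload_len > 0)
        memcpy(buf + 8, payload, payload_len);
    csum = inet_checksum(buf, total);
    buf[2] = (uint8_t)(csum >> 8);
    buf[3] = (uint8_t)(csum & 0xFF);
    return total;
}
```

A receiver can validate the message by running inet_checksum over the whole message, checksum field included: a valid message yields 0. Sending the buffer would then be a sendto() on the raw socket shown above, which normally requires elevated privileges.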

3. Who sends the ICMP error message?

Answer: ICMP error messages are sent by the kernel, which has a dedicated function for sending ICMP packets:

void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info);

For example, when the destination is unreachable because the requested protocol is not available, the kernel makes the following call:

icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PROT_UNREACH, 0);

When an IP packet is larger than the MTU and may not be fragmented, the kernel sends an ICMP "fragmentation needed" error and drops the packet:

if (ip_exceeds_mtu(skb, mtu)) {
  icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
  goto drop;
}

4. How many connections can NAT establish?

Answer: SNAT is mostly used when machines on the internal network access the external network. conntrack identifies each connection by hashing the tuple {source IP, source port, destination IP, destination port}, so the limit applies to distinct tuples, not to the gateway as a whole.

If the intranet has many machines but they access many different destinations, that is, many distinct destination IPs and ports, the number of connections the gateway can carry is very large and can be well over 65535.

However, if every machine on the intranet accesses the same destination IP and port, and the gateway has only one source IP, the source port is the only field left to vary, so the number of connections is capped at 65535. Following the same principle, there are two ways around this: give the gateway multiple source IPs, or deploy multiple NAT gateways that share the traffic from different intranet machines.

On a public cloud, that many machines do not have to sit in a single VPC. They can be spread across multiple VPCs, and each VPC can have its own NAT gateway.
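
The arithmetic behind this answer can be written down directly. This is a back-of-the-envelope sketch, assuming the only constraint is that every {source IP, source port, destination IP, destination port} tuple must be unique; the function name snat_max_connections is made up for illustration, and a real gateway is further limited by its configured port range and conntrack table size.

```c
#include <stdint.h>

/* Upper bound on simultaneous SNAT connections when each tuple
 * {source IP, source port, destination IP, destination port} must be unique.
 *   source_ips    number of public IPs on the NAT gateway(s)
 *   ports_per_ip  usable source ports per public IP (65535 in the text)
 *   destinations  number of distinct (destination IP, destination port) pairs */
static uint64_t snat_max_connections(uint64_t source_ips,
                                     uint64_t ports_per_ip,
                                     uint64_t destinations)
{
    return source_ips * ports_per_ip * destinations;
}
```

With one source IP and one destination, snat_max_connections(1, 65535, 1) gives the 65535 cap from the text. Adding a second source IP doubles it, and each additional destination multiplies it again, which is why traffic spread over many destinations can exceed 65535 connections.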

5. Do public IPs and private IPs need to be bound one to one?

Answer: Public IPs are a limited resource, and on a public cloud you have to pay for them. Not every virtual machine needs one, though. Only machines that serve external users, such as the nginx servers at the access layer, need a public IP. Machines without a public IP go out through SNAT and share the public IP of the SNAT gateway, so a machine with only a private IP can still reach the external network.

6. Routing protocols require routers to exchange information with each other. Doesn't that exchange itself depend on routing? Isn't that a deadlock?

Answer: OSPF runs directly over IP, and OSPF packets are sent only to neighbors, so they travel a single hop and never pass through an intermediate routing device. BGP runs over TCP and exchanges information between BGP peers, which are typically either directly connected or reachable over routes that already exist.

7. What is a multi-line BGP data center?

Answer: BGP is mainly used to interconnect autonomous systems (ASes) on the Internet. Its main job is to control how routes propagate and to select the best route. Every major operator has its own AS number, and most of the major network operators across the country interconnect over multiple lines using BGP and their own ASes.

To achieve multi-line interconnection this way, an IDC must apply to CNNIC (China Internet Network Information Center) or APNIC (Asia Pacific Network Information Centre) for its own IP address block and AS number, and then advertise that address block into the other network operators' networks through BGP.

Once interconnected through BGP, the operators' backbone routing devices each determine the best route to the IP address block of the IDC data center, which ensures fast access for users of different network operators.

8. TCP connections have so many states. Do you know how to check the status of a certain connection in the system?

Answer: Use the netstat or lsof command and grep the output for states such as ESTABLISHED, LISTEN, or CLOSE_WAIT.
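
On Linux the same information is exposed in /proc/net/tcp, where the fourth column (st) holds the connection state as a hexadecimal code. The sketch below decodes one line of that file. It assumes the usual Linux column layout and state numbering (01 ESTABLISHED through 0B CLOSING); the function names tcp_state_name and state_of_proc_line are illustrative only.

```c
#include <stdio.h>

/* Maps the numeric state from /proc/net/tcp to its name. */
static const char *tcp_state_name(unsigned st)
{
    static const char *names[] = {
        NULL, "ESTABLISHED", "SYN_SENT", "SYN_RECV", "FIN_WAIT1",
        "FIN_WAIT2", "TIME_WAIT", "CLOSE", "CLOSE_WAIT", "LAST_ACK",
        "LISTEN", "CLOSING"
    };

    if (st == 0 || st >= sizeof names / sizeof names[0])
        return "UNKNOWN";
    return names[st];
}

/* Extracts the state from one data line of /proc/net/tcp, e.g.
 * "   0: 0100007F:0277 00000000:0000 0A ...".
 * Returns "UNKNOWN" for the header line or malformed input. */
static const char *state_of_proc_line(const char *line)
{
    unsigned st;

    /* skip "sl:", local address, remote address, then read the hex state */
    if (sscanf(line, " %*d: %*s %*s %x", &st) != 1)
        return "UNKNOWN";
    return tcp_state_name(st);
}
```

Reading /proc/net/tcp line by line with fgets() and passing each line to state_of_proc_line gives a simple per-state count, similar to piping netstat through grep.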

9. What is going on when there are too many connections in TIME_WAIT?

Answer: A connection in the TIME_WAIT state was successfully established, belongs to the side that actively initiated the close, and has already had its final ACK sent.

A large number of TIME_WAIT connections usually means there are too many short-lived connections: connections are constantly created and released, leaving many of them in this state, which can make it impossible to initiate new connections. Common remedies are:

  • Turn on the tcp_tw_recycle and tcp_timestamps options (tcp_tw_recycle breaks clients behind NAT and was removed in Linux 4.12, so it is not available on modern kernels);
  • Turn on the tcp_tw_reuse and tcp_timestamps options;
  • Set SO_LINGER in the program so that the application closes the connection forcibly with an RST.

When a client sees "Connection reset", it has usually received a TCP RST segment. An RST is generally sent in the following circumstances:

  • A connection is attempted to a port on which no server is listening;
  • The other party is in the TIME_WAIT state, the connection has already been closed and is in the CLOSED state, or the other party is listening again and the sequence number does not match;
  • A timeout occurs while initiating a connection, on retransmission, or on keepalive;
  • The program uses SO_LINGER, so that closing the connection discards the data still in the buffer and sends an RST to the other party.
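
The SO_LINGER technique mentioned in both lists comes down to one setsockopt() call: enabling linger with a timeout of zero makes close() abort the connection with an RST and skip TIME_WAIT. The sketch below shows that call; the wrapper name enable_abortive_close is hypothetical.

```c
#include <sys/socket.h>

/* Configures fd so that close() discards unsent data and resets the
 * connection (RST) instead of performing the normal FIN handshake.
 * Returns 0 on success, -1 on error with errno set by setsockopt(). */
static int enable_abortive_close(int fd)
{
    struct linger lg;

    lg.l_onoff = 1;   /* enable linger behaviour */
    lg.l_linger = 0;  /* zero timeout: abort immediately on close() */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

Because any data still queued for sending is thrown away, this is a trade-off and not a free fix: it avoids TIME_WAIT at the cost of the orderly shutdown that TIME_WAIT exists to protect.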

10. How is the initial sequence number calculated? Can it conflict?

Answer: The initial sequence number (ISN) is clock-based: it increments by one every 4 microseconds and wraps around roughly every 4.55 hours, the figure given in RFC 793.

The ISN cannot be set to a fixed value, because an attacker could then easily guess subsequent sequence numbers and attack the connection. RFC 1948 proposes a better algorithm for generating a randomized ISN:

ISN = M + F (localhost, localport, remotehost, remoteport)

M is a timer that increments by 1 every 4 microseconds. F is a hash function that produces a pseudo-random value from the source IP, destination IP, source port, and destination port, together with secret data known only to the host. So that outsiders cannot easily compute the hash, RFC 1948 suggests MD5 as a good choice.
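
The shape of the formula can be illustrated in a few lines. The sketch below is only a structural illustration under loud assumptions: it uses FNV-1a as a stand-in for F because the C standard library has no MD5, and FNV-1a is NOT cryptographically secure, so this must not be used as a real ISN generator. The names fnv1a and isn_sketch and the secret parameter are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, used here ONLY as a placeholder for the keyed hash F.
 * It is not cryptographically secure; RFC 1948 suggests MD5. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const uint8_t *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* ISN = M + F(localhost, localport, remotehost, remoteport, secret)
 * now_us is the current time in microseconds; M ticks once per 4 us. */
static uint32_t isn_sketch(uint64_t now_us,
                           uint32_t local_ip, uint16_t local_port,
                           uint32_t remote_ip, uint16_t remote_port,
                           uint64_t secret)
{
    uint32_t m = (uint32_t)(now_us / 4);   /* wraps modulo 2^32 */
    uint32_t f = 2166136261u;              /* FNV offset basis */

    f = fnv1a(f, &local_ip, sizeof local_ip);
    f = fnv1a(f, &local_port, sizeof local_port);
    f = fnv1a(f, &remote_ip, sizeof remote_ip);
    f = fnv1a(f, &remote_port, sizeof remote_port);
    f = fnv1a(f, &secret, sizeof secret);
    return m + f;                          /* unsigned add wraps modulo 2^32 */
}
```

For a fixed four-tuple the ISN still advances with the clock, which keeps sequence numbers of successive incarnations of the same connection from overlapping, while a different four-tuple lands at an offset an outsider cannot predict without the secret.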

11. epoll is a Linux mechanism. Do you know what the corresponding mechanism is on Windows? And if you want to write a cross-platform program, do you know what to do?

Answer: epoll works by event notification: when an event occurs, the application is notified and then calls the I/O function itself to fetch the data. IOCP, the Windows counterpart, is asynchronous transfer: when an event occurs, the IOCP mechanism copies the data directly into the buffer, and the application can use it right away.

For a cross-platform program, the libevent library is recommended. It is an event notification library that runs on Windows, Linux, BSD, and other platforms, and internally it manages events with system facilities such as select, epoll, kqueue, and IOCP.
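
The epoll half of that comparison, "be notified, then read the data yourself", can be shown with a small Linux-only sketch. The helper name wait_readable is hypothetical, and creating a fresh epoll instance per call is done only to keep the example short; a real program would create it once and reuse it.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Waits until fd becomes readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event out;
    int ep, n;

    ep = epoll_create1(0);
    if (ep < 0)
        return -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }
    n = epoll_wait(ep, &out, 1, timeout_ms);   /* readiness only, no data yet */
    close(ep);
    if (n < 0)
        return -1;
    return (n > 0 && (out.events & EPOLLIN)) ? 1 : 0;
}
```

After wait_readable returns 1, the application still has to call read() or recv() to move the data into its own buffer. With IOCP the order is reversed: the application hands over a buffer first and is notified once the data is already in it.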

12. How does ping work?

Answer: ICMP has no ports, so connection tracking identifies a ping flow by other fields. The contents of nf_conntrack_tuple uniquely identify a connection:

src: contains the source IP address; for TCP or UDP it also contains the source port; for ICMP it contains the ID;

dst: contains the destination IP address; for TCP or UDP it also contains the destination port; for ICMP it contains the type and code.
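
To make the src/dst description concrete, here is a deliberately simplified stand-in for such a tuple. It is not the kernel's nf_conntrack_tuple definition: the struct flow_tuple, the enum l4_kind, and flow_tuple_equal are hypothetical names, IPv4 only, written to show which fields take part in identifying a flow.

```c
#include <stdbool.h>
#include <stdint.h>

enum l4_kind { L4_TCP_UDP, L4_ICMP };

/* Simplified, illustrative tuple; NOT the kernel's layout. */
struct flow_tuple {
    enum l4_kind kind;
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port_or_icmp_id; /* src part: port (TCP/UDP) or ID (ICMP) */
    uint16_t dst_port;            /* dst part, TCP/UDP only */
    uint8_t  icmp_type;           /* dst part, ICMP only */
    uint8_t  icmp_code;           /* dst part, ICMP only */
};

/* Two packets belong to the same tracked flow if their tuples match. */
static bool flow_tuple_equal(const struct flow_tuple *a,
                             const struct flow_tuple *b)
{
    if (a->kind != b->kind ||
        a->src_ip != b->src_ip ||
        a->dst_ip != b->dst_ip ||
        a->src_port_or_icmp_id != b->src_port_or_icmp_id)
        return false;
    if (a->kind == L4_ICMP)
        return a->icmp_type == b->icmp_type &&
               a->icmp_code == b->icmp_code;
    return a->dst_port == b->dst_port;
}
```

Two ping processes on the same host that target the same address use different ICMP IDs, so their tuples differ and their replies can be told apart even though no ports are involved.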

This article is a study note for Day 12 in October. The content comes from Geek Time’s “Internet Protocol”. This course is recommended.