Network Programming Sockets (3) – Protocol Customization | Serialization and Deserialization

Article directory

  • 1. Understanding “protocol”
    • 1. The concept of protocol
    • 2. Transmission of structured data
    • 3. Serialization and deserialization
  • 2. Online calculator
    • 1. Server
    • 2. Protocol customization
      • (1) Correct understanding of network sending and reading
      • (2) Problems with protocol customization
    • 3. Client
    • 4. Code
  • 3. JSON implements serialization and deserialization
    • 1. Brief introduction
    • 2. Use

1. Understanding “protocol”

1. The concept of protocol

A protocol, short for network protocol, is a set of conventions that both communicating computers must follow, such as how to establish a connection and how to identify each other.

In order for data to reach the destination from the source on the network, the participants in the network communication must follow the same rules. We call this set of rules a protocol, and the protocol ultimately needs to be expressed in a computer language. Only when both communicating computers comply with the same protocol can computers communicate with each other.

2. Transmission of structured data

When two parties communicate over the network:

  • If the data to be transmitted is a string, it can be sent to the network directly, and the other end can read the same string from the network.
  • If the data to be transmitted is structured, however, its fields should not be sent to the network one by one.

For example, to implement an online calculator, each request sent by the client to the server needs to include the left operand, the right operand and the operation to perform. What the client wants to send is therefore not a simple string but a set of structured data.

If the client sent these fields to the network one by one, the server could only read them from the network one by one, and it would still have to figure out how to put the received pieces back together. It is therefore best for the client to package the structured data before sending it, so that what the server reads from the network each time is one complete request. There are two common “packaging” methods for the client:

Combine structured data into a string

Convention one:

  • The client sends a string of the form “1+1”.
  • The string contains two operands, both of which are integers.
  • Between the two numbers there is a single character, which is the operator.
  • There are no spaces between the numbers and the operator.

The client combines the structured data into a string according to these rules and sends the string to the network. The server then reads such a string from the network each time and parses it by the same rules, which lets it extract the structured data from the string.
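As a minimal sketch of how the server side might parse a string under convention one: the `Expr` struct and the `parseExpr` function below are hypothetical names introduced for illustration, and the set of accepted operators and the 9-digit limit on each operand are assumptions rather than part of the convention.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical holder for the parsed request (not part of the article's code).
struct Expr {
    int left;
    char op;
    int right;
};

// Parse "<digits><op><digits>" with no spaces, e.g. "12+34".
// Returns false if the string does not follow convention one.
bool parseExpr(const std::string &s, Expr *out) {
    assert(out != nullptr);

    // Left operand: 1 to 9 digits, so the value always fits in an int.
    size_t i = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    if (i == 0 || i > 9 || i >= s.size()) return false;

    // Exactly one operator character (assumed operator set).
    const char op = s[i];
    if (op != '+' && op != '-' && op != '*' && op != '/' && op != '%') return false;

    // Right operand: 1 to 9 digits, and nothing may follow it.
    const size_t start = i + 1;
    size_t j = start;
    while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j]))) ++j;
    if (j == start || j - start > 9 || j != s.size()) return false;

    out->left = std::stoi(s.substr(0, i));
    out->op = op;
    out->right = std::stoi(s.substr(start));
    return true;
}
```

Because both sides agreed that there are no spaces, a string such as "1 + 1" is rejected instead of being guessed at; that strictness is what makes the convention a protocol.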

Custom structure + serialization and deserialization

Convention two:

  • Customize the structure to represent the information that needs to be interacted with.
  • When sending data, this structure is converted into a network standard data format according to a rule, and when receiving data, the received data is converted into a structure according to the same rules.
  • This process is called “serialization” and “deserialization”.

The client can define a structure that holds all the information to be exchanged. When the client sends data, it first serializes the structure; after the server receives the data, it deserializes it. The server then has the same structure the client sent and can extract the corresponding information from it.

3. Serialization and deserialization

Serialization and deserialization:

  • Serialization is the process of converting an object’s state information into a form (a sequence of bytes) that can be stored or transmitted.
  • Deserialization is the process of restoring a sequence of bytes into an object.

The function of the presentation layer in the OSI seven-layer model is to realize the conversion between the device’s inherent data format and the network standard data format. The inherent data format of the device refers to the format of the data at the application layer, while the network standard data format refers to the data format that can be transmitted over the network after serialization.

Purpose of serialization and deserialization

  • During network transmission, serialization makes network data easier to send and receive. Whatever its type, data becomes a binary sequence after serialization, so the lower layers see everything they transmit uniformly as binary sequences.
  • Only the lower layers handle the serialized binary sequence during transmission; the upper-layer application cannot use it directly. The data obtained from the network therefore has to be deserialized, converting the binary sequence into a data format that the application layer can recognize.

We can think of network communication and business processing as being at different levels. During network communication, the lower layers see binary sequence data; during business processing, the upper layer sees data it can recognize. Whenever data crosses between business processing and network communication, it has to be serialized or deserialized accordingly.
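To make the idea of "object to byte sequence and back" concrete, here is a small sketch under stated assumptions: the `CalcRequest` struct and the 9-byte layout (two 32-bit big-endian integers followed by one operator byte) are invented for illustration and are not the format the article uses later, which is text-based.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical request type, used only in this sketch.
struct CalcRequest {
    int32_t x;
    int32_t y;
    char op;
};

// Append a 32-bit value in big-endian (network) byte order.
static void putU32(std::string &out, uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a 32-bit big-endian value starting at pos.
static uint32_t getU32(const std::string &in, size_t pos) {
    uint32_t v = 0;
    for (size_t i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos + i]);
    return v;
}

// Serialization: struct -> 9 bytes (x, y, op).
std::string serialize(const CalcRequest &req) {
    std::string out;
    putU32(out, static_cast<uint32_t>(req.x));
    putU32(out, static_cast<uint32_t>(req.y));
    out.push_back(req.op);
    return out;
}

// Deserialization: 9 bytes -> struct. Returns false on a wrong length.
bool deserialize(const std::string &bytes, CalcRequest *req) {
    assert(req != nullptr);
    if (bytes.size() != 9) return false;
    req->x = static_cast<int32_t>(getU32(bytes, 0));
    req->y = static_cast<int32_t>(getU32(bytes, 4));
    req->op = bytes[8];
    return true;
}
```

Writing the integers byte by byte, rather than copying the struct's memory, keeps the result independent of the sender's endianness and struct padding, which is the point of agreeing on a network standard data format.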

2. Online Calculator

We need to implement a network version of a calculator. The client sends the two numbers to be calculated and the type of calculation, the server performs the calculation, and the result is finally returned to the client.

1. Server

Server creation steps:

  • Call socket to create a socket
  • Call bind to bind the port
  • Call listen to set the socket status to listening
  • Call accept to get a new connection
  • Handle reading and writing (the key part)
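The first four steps can be sketched roughly as follows. The function names `createListenSocket` and `acceptOne`, the default backlog of 16 and the use of `SO_REUSEADDR` are assumptions made for this example, not the article's own server code.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// socket -> bind -> listen. Returns the listening fd, or -1 on failure.
// Passing port 0 lets the kernel pick a free port.
int createListenSocket(uint16_t port, int backlog = 16) {
    assert(backlog > 0);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    // Allow quick restarts of the server on the same port (optional).
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    sockaddr_in local;
    std::memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY); // accept on any local address
    local.sin_port = htons(port);

    if (bind(fd, reinterpret_cast<sockaddr *>(&local), sizeof(local)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// accept: blocks until a client connects, then returns the connection fd.
int acceptOne(int listenFd) {
    return accept(listenFd, nullptr, nullptr);
}
```

The fd returned by `acceptOne` is the one used for the fifth step, reading requests and writing responses, which is where the protocol discussed next comes in.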

2. Protocol customization

(1) Correct understanding of network sending and reading

When the client and the server communicate, they call the read and write functions. Do these functions send data directly to the peer? No.

  • TCP has its own send buffer and receive buffer.
  • Calling write essentially copies the user's data into the TCP send buffer.
  • Calling read essentially copies data from the TCP receive buffer to the user layer.
  • So read and write are essentially copy functions.
  • Once the data has been copied into the TCP send buffer, TCP decides when and how much of it to send, which is why TCP is called the Transmission Control Protocol.
  • Because each end has both a send buffer and a receive buffer, sending and receiving can happen at the same time, so TCP is full-duplex.

To sum up:

  • The essence of TCP communication is to copy the data in one’s own sending buffer to the other party’s receiving buffer through the network.
  • The essence of network communication is also copying
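One practical consequence of "write is a copy into the send buffer" is that a single call may copy only part of the data when the buffer is nearly full. A minimal sketch of a loop that keeps copying until everything has been handed to the kernel is shown below; `sendAll` is a hypothetical helper name, not something from the article's code.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <cassert>
#include <cerrno>
#include <string>

// Call send repeatedly until every byte of data has been copied into the
// kernel's send buffer. Returns false if send reports an error.
bool sendAll(int sock, const std::string &data) {
    assert(sock >= 0);

    size_t sent = 0;
    while (sent < data.size()) {
        ssize_t n = send(sock, data.data() + sent, data.size() - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue; // interrupted by a signal, retry
            return false;
        }
        sent += static_cast<size_t>(n); // n may be smaller than requested
    }
    return true;
}
```

A true return value only means the data now sits in the local send buffer; when it actually reaches the peer's receive buffer is up to TCP.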

(2) Problems with protocol customization

Before customizing the protocol, we must first solve a problem. In our earlier TCP code we simply called read, without considering that TCP is byte-stream-oriented and that a single read may return incomplete data. The same problem exists here: if the peer sends many messages at once and they accumulate in the TCP receive buffer, how can we ensure that we read exactly one complete message?

There are three common approaches:

  • Fixed-length messages
  • Special symbols (a delimiter added between messages)
  • Self-describing messages (a protocol of your own design, for example one that carries the message length in a header)
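As an illustration of the second approach, the sketch below splits an accumulated receive buffer on a delimiter. The function name `extractDelimited` and the choice of "\n" as the default delimiter are assumptions for this example; the article itself goes on to use the self-describing approach.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Move every complete, delimiter-terminated message out of inbuffer.
// A trailing partial message stays in inbuffer until more data arrives.
std::vector<std::string> extractDelimited(std::string &inbuffer,
                                          const std::string &delim = "\n") {
    assert(!delim.empty()); // an empty delimiter would never make progress

    std::vector<std::string> messages;
    size_t pos;
    while ((pos = inbuffer.find(delim)) != std::string::npos) {
        messages.push_back(inbuffer.substr(0, pos));
        inbuffer.erase(0, pos + delim.size());
    }
    return messages;
}
```

The weakness of this approach is that the delimiter must never appear inside a message, which is one reason to prefer a length header when the payload is arbitrary.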

Protocol design:

Protocol.hpp

#include<string>
#include<iostream>
#include<vector>
#include<cstring>
#include<sys/types.h>
#include<sys/socket.h>
#include"Util.hpp"
using namespace std;

// Customize the protocol for the network version of the calculator
namespace Protocol_ns
{

    #define SEP " "
    #define SEP_LEN strlen(SEP) // Must not be written as sizeof
    #define HEADER_SEP "\r\n"
    #define HEADER_SEP_LEN strlen(HEADER_SEP)

    // Packet format: "length\r\n" + "_x op _y\r\n"
    // "10 + 20" => "7\r\n10 + 20\r\n" => header + payload
    // Request/Response = header\r\npayload\r\n
    // On the byte stream, packets follow one another:
    // header\r\npayload\r\nheader\r\npayload\r\nheader\r\npayload\r\n

    // "10 + 20" => "7\r\n10 + 20\r\n"

    string AddHeader(const string &str)
    {
        cout<<"AddHeader before:\n"
            <<str<<endl;

        string s=to_string(str.size());
        s+=HEADER_SEP;
        s+=str;
        s+=HEADER_SEP;

        cout<<"After AddHeader:\n"
            <<s<<endl;

        return s;
    }

    // "7\r\n10 + 20\r\n" => "10 + 20"
    string RemoveHeader(const string &str,int len)
    {
        cout<<"RemoveHeader before:\n"
            <<str<<endl;

        // Take the payload counting from the back: skip the trailing separator
        string res=str.substr(str.size()-HEADER_SEP_LEN-len,len);

        cout<<"After RemoveHeader:\n"
            <<res<<endl;

        return res;
    }

    // Returns -1 on error or a closed connection, 0 if no complete packet is
    // buffered yet, otherwise the payload length (the packet is stored in *package)
    int Readpackage(int sock,string &inbuffer,string*package)
    {
        cout<<"ReadPackage inbuffer before:\n"
            <<inbuffer<<endl;

        // Read from the socket and append to inbuffer
        char buffer[1024];
        ssize_t s=recv(sock,buffer,sizeof(buffer)-1,0);
        if(s<=0)
            return -1;

        buffer[s]=0;
        inbuffer+=buffer;

        cout<<"ReadPackage inbuffer:\n"
            <<inbuffer<<endl;

        // Parse while reading: "7\r\n10 + 20\r\n"
        auto pos=inbuffer.find(HEADER_SEP);
        if(pos==string::npos)
            return 0;
        
        string lenStr=inbuffer.substr(0,pos); // Get the header string, inbuffer is not touched
        int len=Util::toInt(lenStr); // Get the length of the payload => "123" -> 123
        size_t targetPackageLen=len+2*HEADER_SEP_LEN+lenStr.size(); // Get the entire packet length
        if(inbuffer.size()<targetPackageLen) // Not a complete packet
            return 0;
        
        *package=inbuffer.substr(0,targetPackageLen); // Copy out the whole packet (header + payload), inbuffer is not touched
        inbuffer.erase(0,targetPackageLen); // Remove the entire packet directly from the inbuffer

        cout<<"After ReadPackage inbuffer:\n"
            <<inbuffer<<endl;

        return len;
    }

    // Request && Response must provide serialization and deserialization functions
    // 1. Handwrite by yourself
    // 2. Use other people's --- json, xml, protobuf

    class Request
    {
    public:

        Request()
        {

        }

        Request(int x,int y,char op)
            :_x(x)
            ,_y(y)
            ,_op(op)
        {
        }

        // Serialization: struct->string
        bool Serialize(string* outStr)
        {
            *outStr="";
            string x_string=to_string(_x);
            string y_string=to_string(_y);

            // Manual serialization: "_x op _y"
            *outStr=x_string + SEP + _op + SEP + y_string;

            std::cout << "Request Serialize:\n"
                      << *outStr << std::endl;

            return true;
        }

        // Deserialization: string->struct
        bool Deserialize(const string &inStr)
        {
            // inStr: 10 + 20 => [0]=>10, [1]=>+, [2]=>20
            vector<string> result;
            Util::StringSplit(inStr,SEP,&result);

            if(result.size()!=3)
                return false;
            if(result[1].size()!=1)
                return false;

            _x=Util::toInt(result[0]);
            _y=Util::toInt(result[2]);
            _op=result[1][0];

            return true;
        }

        ~Request()
        {
            
        }

    public:
        // _x op _y ==> 10 * 9 ? ==> 10 / 0 ?
        int _x;
        int _y;
        char _op;
    };

    class Response
    {
    public:
        Response()
        {

        }
        
        Response(int result,int code)
            :_result(result)
            ,_code(code)
        {
            
        }

        // Serialization: struct->string
        bool Serialize(string* outStr)
        {
            // _result _code
            *outStr="";
            string res_string = to_string(_result);
            string code_string = to_string(_code);

            // Manual serialization: "_result _code"
            *outStr=res_string + SEP + code_string;

            return true;
        }

        // Deserialization: string->struct
        bool Deserialize(const string &inStr)
        {
            // 10 0
            vector<string> result;
            Util::StringSplit(inStr,SEP,&result);

            if(result.size()!=2)
                return false;

            _result=Util::toInt(result[0]);
            _code=Util::toInt(result[1]);
            return true;
        }

        ~Response()
        {

        }

    public:
        int _result;
        int _code; // 0 success; 1,2,3,4 represent different error codes
    };

}
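To see how this header format copes with several packets arriving in one read, or with a packet arriving in pieces, here is a socket-free sketch of the same "length\r\npayload\r\n" framing operating on plain strings. The names `encodeFrame` and `decodeFrame`, the three-way return value and the 9-digit limit on the length header are assumptions of this sketch, not part of Protocol.hpp.

```cpp
#include <cassert>
#include <string>

static const std::string kSep = "\r\n";

// "10 + 20" -> "7\r\n10 + 20\r\n"
std::string encodeFrame(const std::string &payload) {
    return std::to_string(payload.size()) + kSep + payload + kSep;
}

// Try to take one frame from the front of inbuffer.
// Returns 1 and fills *payload if a complete frame was removed,
// 0 if more bytes are needed, -1 if the data is malformed.
int decodeFrame(std::string &inbuffer, std::string *payload) {
    assert(payload != nullptr);

    const size_t pos = inbuffer.find(kSep);
    if (pos == std::string::npos) return 0; // header not complete yet

    // The header must be 1 to 9 decimal digits.
    const std::string lenStr = inbuffer.substr(0, pos);
    if (lenStr.empty() || lenStr.size() > 9) return -1;
    for (char c : lenStr)
        if (c < '0' || c > '9') return -1;

    const size_t len = std::stoul(lenStr);
    const size_t total = lenStr.size() + kSep.size() + len + kSep.size();
    if (inbuffer.size() < total) return 0; // payload not complete yet

    // The payload must be followed by the closing separator.
    if (inbuffer.compare(total - kSep.size(), kSep.size(), kSep) != 0) return -1;

    *payload = inbuffer.substr(pos + kSep.size(), len);
    inbuffer.erase(0, total);
    return 1;
}
```

Calling `decodeFrame` in a loop until it returns 0 drains every complete packet that is already buffered, so whatever is left over simply waits for the next read.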

Util.hpp

#pragma once

#include<iostream>
#include<string>
#include<vector>
#include<cstdlib>
using namespace std;

class Util
{
public:
    // Input: const &
    // Output: *
    // Input/output: *
    static bool StringSplit(const string &str, const string &sep, vector<string> *result)
    {
        // 10 + 20
        size_t start = 0;

        while (start < str.size())
        {
            auto pos = str.find(sep, start);
            if (pos == string::npos)
                break;

            result->push_back(str.substr(start, pos - start));

            // Move past the separator
            start = pos + sep.size();
        }

        // Process the last substring
        if(start<str.size())
            result->push_back(str.substr(start));

        return true;
    }

    static int toInt(const string &s) // Convert string to integer
    {
        return atoi(s.c_str());
    }
};

3. Client

Client creation steps:

  • Call socket to create a socket
  • The client does not need to bind the port itself
  • Call connect to connect to the server
  • Handling read and write issues
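The connection part of these steps can be sketched as follows. `connectTo` is a hypothetical helper name, and the example assumes an IPv4 address given in dotted-decimal form; it is not the article's own client code.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// socket -> connect. No explicit bind: the kernel assigns a local port
// when connect is called. Returns the connected fd, or -1 on failure.
int connectTo(const char *ip, uint16_t port) {
    assert(ip != nullptr);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    sockaddr_in server;
    std::memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(port);

    // inet_pton returns 1 only when ip is a valid IPv4 address string.
    if (inet_pton(AF_INET, ip, &server.sin_addr) != 1 ||
        connect(fd, reinterpret_cast<sockaddr *>(&server), sizeof(server)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After a successful connect, the client serializes a request, adds the header, writes it to the returned fd, and then reads and parses the response in the same way the server reads requests.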

4. Code

Source code address

3. JSON implements serialization and deserialization

1. Brief introduction

The code above implements serialization and deserialization with our own custom protocol. Next, we use a ready-made solution instead. protobuf and JSON are commonly used in C++; the simpler one, JSON, is used here.

JSON (JavaScript Object Notation) is a lightweight data exchange format. It is easy for humans to read and write, and easy for machines to parse and generate. JSON uses a completely language-independent text format, but follows conventions familiar from the C family of languages (including C, C++, Java, JavaScript, Perl, Python, etc.). These properties make JSON an ideal data exchange language. JSON data consists of key-value pairs; curly brackets represent objects, and square brackets represent arrays.

2. Use

  • Install the JSON library:
yum install -y jsoncpp-devel
  • Include the jsoncpp header file:
#include <jsoncpp/json/json.h>

Note that the makefile must link against the jsoncpp library, for example by adding -ljsoncpp to the compile command.

When using it, we create Json objects directly to perform serialization and deserialization.

#include<string>
#include<iostream>
#include<vector>
#include<cstring>
#include<sys/types.h>
#include<sys/socket.h>
#include"Util.hpp"
#include<jsoncpp/json/json.h>
using namespace std;

// #define MYSELF 1

// Customize the protocol for the network version of the calculator

namespace Protocol_ns
{

    #define SEP " "
    #define SEP_LEN strlen(SEP) // Must not be written as sizeof
    #define HEADER_SEP "\r\n"
    #define HEADER_SEP_LEN strlen(HEADER_SEP)

    // Packet format: "length\r\n" + "_x op _y\r\n"
    // "10 + 20" => "7\r\n10 + 20\r\n" => header + payload
    // Request/Response = header\r\npayload\r\n
    // On the byte stream, packets follow one another:
    // header\r\npayload\r\nheader\r\npayload\r\nheader\r\npayload\r\n

    // Future: "length\r\n" + "protocol number\r\n" + "_x op _y\r\n"

    // "10 + 20" => "7\r\n10 + 20\r\n"

    string AddHeader(const string &str)
    {
        cout<<"AddHeader before:\n"
            <<str<<endl;

        string s=to_string(str.size());
        s+=HEADER_SEP;
        s+=str;
        s+=HEADER_SEP;

        cout<<"After AddHeader:\n"
            <<s<<endl;

        return s;
    }


    // "7\r\n10 + 20\r\n" => "10 + 20"
    string RemoveHeader(const string &str,int len)
    {
        cout<<"RemoveHeader before:\n"
            <<str<<endl;

        // Take the payload counting from the back: skip the trailing separator
        string res=str.substr(str.size()-HEADER_SEP_LEN-len,len);

        cout<<"After RemoveHeader:\n"
            <<res<<endl;

        return res;
    }


    // Returns -1 on error or a closed connection, 0 if no complete packet is
    // buffered yet, otherwise the payload length (the packet is stored in *package)
    int Readpackage(int sock,string &inbuffer,string*package)
    {
        cout<<"ReadPackage inbuffer before:\n"
            <<inbuffer<<endl;

        // Read from the socket and append to inbuffer
        char buffer[1024];
        ssize_t s=recv(sock,buffer,sizeof(buffer)-1,0);
        if(s<=0)
            return -1;

        buffer[s]=0;
        inbuffer+=buffer;

        cout<<"ReadPackage inbuffer:\n"
            <<inbuffer<<endl;

        // Parse while reading: "7\r\n10 + 20\r\n"
        auto pos=inbuffer.find(HEADER_SEP);
        if(pos==string::npos)
            return 0;
        
        string lenStr=inbuffer.substr(0,pos); // Get the header string, inbuffer is not touched
        int len=Util::toInt(lenStr); // Get the length of the payload => "123" -> 123
        size_t targetPackageLen=len+2*HEADER_SEP_LEN+lenStr.size(); // Get the entire packet length
        if(inbuffer.size()<targetPackageLen) // Not a complete packet
            return 0;
        
        *package=inbuffer.substr(0,targetPackageLen); // Copy out the whole packet (header + payload), inbuffer is not touched
        inbuffer.erase(0,targetPackageLen); // Remove the entire packet directly from the inbuffer

        cout<<"After ReadPackage inbuffer:\n"
            <<inbuffer<<endl;

        return len;
    }



    // Request && Response must provide serialization and deserialization functions
    // 1. Handwrite by yourself
    // 2. Use someone else’s

    class Request
    {
    public:

        Request()
        {

        }

        Request(int x,int y,char op)
            :_x(x)
            ,_y(y)
            ,_op(op)
        {

        }


        // Serialization: struct->string
        bool Serialize(string* outStr)
        {
            *outStr="";
#ifdef MYSELF
            string x_string=to_string(_x);
            string y_string=to_string(_y);

            // Manual serialization: "_x op _y"
            *outStr=x_string + SEP + _op + SEP + y_string;

            std::cout << "Request Serialize:\n"
                      << *outStr << std::endl;
#else
            Json::Value root; // Value: a universal object, accepting any kv type
            root["x"]=_x;
            root["y"]=_y;
            root["op"]=_op; // the char is stored as its integer character code

            // Json::FastWriter writer; // writer: is used for serialization struct -> string
            Json::StyledWriter writer;

            *outStr=writer.write(root);
#endif
            return true;
        }

        // Deserialization: string->struct
        bool Deserialize(const string &inStr)
        {
#ifdef MYSELF
            // inStr: 10 + 20 => [0]=>10, [1]=>+, [2]=>20
            vector<string> result;
            Util::StringSplit(inStr,SEP,&result);

            if(result.size()!=3)
                return false;
            if(result[1].size()!=1)
                return false;

            _x=Util::toInt(result[0]);
            _y=Util::toInt(result[2]);
            _op=result[1][0];

#else
            Json::Value root;
            Json::Reader reader; // Reader: is used for deserialization
            if(!reader.parse(inStr,root)) // reject text that is not valid JSON
                return false;

            // asInt rather than asUInt, so negative operands are handled
            _x=root["x"].asInt();
            _y=root["y"].asInt();
            _op=static_cast<char>(root["op"].asInt());
    
#endif
            Print();

            return true;
        }

        void Print()
        {
            std::cout << "_x: " << _x << std::endl;
            std::cout << "_y: " << _y << std::endl;
            std::cout << "_op: " << _op << std::endl;
        }

        ~Request()
        {
            
        }

    public:
        // _x op _y ==> 10 * 9 ? ==> 10 / 0 ?
        int _x;
        int _y;
        char _op;

    };

    class Response
    {
    public:
        Response()
        {

        }

        
        Response(int result,int code)
            :_result(result)
            ,_code(code)
        {
            
        }


        // Serialization: struct->string
        bool Serialize(string* outStr)
        {
            // _result _code
            *outStr="";
#ifdef MYSELF
            string res_string = to_string(_result);
            string code_string = to_string(_code);

            // Manual serialization: "_result _code"
            *outStr=res_string + SEP + code_string;

#else
            Json::Value root;
            root["result"]=_result;
            root["code"]=_code;
            // Json::FastWriter writer;
            Json::StyledWriter writer;

            *outStr=writer.write(root);
#endif
            return true;
        }

        // Deserialization: string->struct
        bool Deserialize(const string &inStr)
        {
#ifdef MYSELF
            // 10 0
            vector<string> result;
            Util::StringSplit(inStr,SEP,&result);

            if(result.size()!=2)
                return false;

            _result=Util::toInt(result[0]);
            _code=Util::toInt(result[1]);

#else
            Json::Value root;
            Json::Reader reader;
            if(!reader.parse(inStr, root)) // reject text that is not valid JSON
                return false;

            // asInt rather than asUInt, so negative results are handled
            _result = root["result"].asInt();
            _code = root["code"].asInt();
#endif
            Print();
            return true;
        }

        void Print()
        {
            std::cout << "_result: " << _result << std::endl;
            std::cout << "_code: " << _code << std::endl;
        }

        ~Response()
        {

        }

    public:
        int _result;
        int _code; // 0 success; 1,2,3,4 represent different error codes
    };
}
