Implementing structure serialization and deserialization in C++

1. What is structure serialization and deserialization

Serialization is the process of converting an object into a byte sequence.

Deserialization is the process of converting a byte sequence back into an object.

The byte sequence produced by serializing an object contains the object's type information, its data, and so on. In other words, it contains all the information needed to describe the object. Based on this information, an object identical to the original can be "reproduced".

Put simply, structure serialization converts the data in a structure into a general-purpose block of data. When another structure of the same type receives this data, it can deserialize it and recover the value of each member.

#include <cstring>
#include <iostream>
using namespace std;
struct A
{
    int a;
    int b;
};
//Rebuild an A from a raw byte sequence and print its members
void test(const char* s, int len)
{
    A b;
    memcpy(&b, s, len);
    cout<<b.a<<" "<<b.b;
}
int main(int argc, char *argv[])
{
    A a;
    a.a = 20;
    a.b = 30;
    test((const char *)&a, sizeof(a));
    return 0;
}

In the above code, the data of a is copied into b through a byte sequence. Of course, a simple structure like A can be copied directly with =. However, if the structure has to be passed to many places in the program, it is worth using a more general data type for parameter passing. In the code above, you could first copy the data into a std::string, pass the string around, and copy the data from the string back into a structure where it is needed. That completes one round of serialization and deserialization.

2. Why do we need structure serialization and deserialization

2.1 Data persistence

The structure instances we create normally live on the stack or on the heap. Stack data is released automatically when it goes out of scope, and heap memory must be released manually to avoid leaks; either way, nothing survives after the program ends. If a new program needs the structure data of a previous run, the data has to be serialized into a byte sequence and stored on disk. The new program then deserializes that byte sequence to obtain an identical structure instance.

2.2 Avoid data copy failure caused by memory alignment

#include <iostream>
using namespace std;
#pragma pack (4) //Specify the number of bytes for memory alignment
struct A
{
    int a;
    double b;
    int c;

};
int main()
{
    cout<<sizeof(A);
    return 0;
}

With #pragma pack(4), members are aligned to at most 4 bytes, so the size of A is 16.

#include <iostream>
using namespace std;
#pragma pack (8) //Specify the number of bytes for memory alignment
struct A
{
    int a;
    double b;
    int c;

};
int main(int argc, char *argv[])
{
    cout<<sizeof(A);
    return 0;
}

With #pragma pack(8), members are aligned to at most 8 bytes, and the size of A becomes 24 (on platforms where double is naturally 8-byte aligned).

Why does this happen? It is a matter of memory alignment. This article does not go into the principles of memory alignment; for details, see "Easily understand memory alignment in one article" on Zhihu (zhihu.com).

If the two ends of a network connection were built with different memory alignment rules, they will compute different sizes for the same structure. Copying the whole structure's memory directly then copies a few bytes too many or too few, which is dangerous and can easily corrupt memory.

For a solution, look at JSON and XML serialization, which are common in network transmission. Both serialize an object into a string before transmitting it, and on deserialization they parse the values back out in order according to fixed rules. A string is easy to read, but it greatly increases the length of the transmitted data. Instead, we can copy each structure member into a cache in turn, and then convert the cache into some other general data structure for transmission.

#include <cstring>
#include <iostream>
#include <string>
using namespace std;

//Minimal archive: no bounds checks, so the caller must not write past buf_size
class CSeArchive
{
public:
    CSeArchive(int a_buf_size = 1024 * 5) //Create an empty cache for writing
    {
        buf = new char[a_buf_size];
        memset(buf,0x00,a_buf_size);
        bufptr = buf;
        buf_size = a_buf_size;
        data_size = 0;
    }
    CSeArchive(const char *a_buf, int a_buf_size) //Wrap existing bytes for reading
    {
        buf = new char[a_buf_size]; //Allocate exactly what is needed so memcpy cannot overflow
        memcpy(buf,a_buf,a_buf_size);
        bufptr = buf;
        buf_size = a_buf_size;
        data_size = a_buf_size;
    }
    ~CSeArchive()
    {
        delete[] buf; //delete[] on nullptr is a no-op, so no check is needed
        buf = nullptr;
        bufptr = nullptr;
    }
    //The class owns a raw buffer, so copying is disabled
    CSeArchive(const CSeArchive &) = delete;
    CSeArchive & operator=(const CSeArchive &) = delete;

    char* get_buf(std::string & strData) //Copy the written bytes into strData and return the cache pointer
    {
        strData = std::string(buf,data_size);
        return buf;
    }

    inline CSeArchive & operator<<(int i) //Write int data into the cache
    {
        memcpy(bufptr,&i,sizeof(i));
        bufptr += sizeof(i);
        data_size += sizeof(i);
        return *this;
    }
    inline CSeArchive & operator<<(double i) //Write double data into the cache
    {
        memcpy(bufptr,&i,sizeof(i));
        bufptr += sizeof(i);
        data_size += sizeof(i);
        return *this;
    }
    inline CSeArchive & operator>>(int & i) //Read int data from the cache
    {
        memcpy(&i,bufptr,sizeof(i));
        bufptr += sizeof(i);
        return *this;
    }
    inline CSeArchive & operator>>(double & i) //Read double data from the cache
    {
        memcpy(&i,bufptr,sizeof(i));
        bufptr += sizeof(i);
        return *this;
    }
private:
    char *buf, *bufptr;
    int buf_size, data_size;
};

struct A
{
    int a;
    double b;
    int c;
    string Serialize() //Write the members one by one, so padding is never copied
    {
        CSeArchive cs;
        cs<<a<<b<<c;
        string ret;
        cs.get_buf(ret);
        return ret;
    }

    void UnSerialize(const string & s) //Read the members back in the same order
    {
        CSeArchive cs(s.c_str(),(int)s.size());
        cs>>a>>b>>c;
    }
};

int main()
{
    A a;
    a.a = 10;
    a.b = 20.5;
    a.c = 30;
    A b;
    string s = a.Serialize();
    b.UnSerialize(s);
    cout<<b.a<<" "<<b.b<<" "<<b.c;
    return 0;
}

2.3 Copying complex structures

When the structure contains objects or containers, its memory cannot be copied directly, because members such as std::string hold pointers to heap memory rather than the data itself. In that case the structure can be copied through serialization and deserialization. The implementation is simple: overload the << operator to write data into the cache, and overload the >> operator to read it back.

2.3.1 Support arrays

//Members of CSeArchive; uint32 is assumed to be a typedef for a 32-bit unsigned integer
template <typename T, size_t N> //Write a fixed-size array
inline CSeArchive & operator<<(T (&m)[N])
{
    return wr_array(m, N);
}

template <typename T>
inline CSeArchive & wr_array(T arData[], uint32 uCount) //Write uCount elements
{
    for(uint32 i = 0; i < uCount; i++)
    {
        memcpy(bufptr,arData + i,sizeof(arData[0]));
        bufptr += sizeof(arData[0]);
        data_size += sizeof(arData[0]);
    }
    return *this;
}

template <typename T, size_t N> //Read a fixed-size array
inline CSeArchive & operator>>(T (&m)[N]) {return rd_array(m, N);}

template <typename T>
inline CSeArchive & rd_array(T arData[], uint32 uCount) //Read uCount elements
{
    for(uint32 i = 0; i < uCount; i++)
    {
        memcpy(arData + i,bufptr,sizeof(arData[0]));
        bufptr += sizeof(arData[0]);
    }
    return *this;
}

2.3.2 Support std::string

int CSeArchive::getStrlen(char *p) //Distance from p to the next '\0', limited to the stored data
{
    if(p == nullptr)
        return 0;
    int remaining = data_size - (int)(p - buf); //Bytes left between p and the end of the data
    int count = 0;
    while(count < remaining && p[count] != '\0')
        count++;
    return count;
}

CSeArchive & operator<<(const std::string & strValue) //Overloaded operator (write string)
{
    memcpy(bufptr,strValue.c_str(),strValue.size() + 1); //Copy the characters and the terminating '\0'
    bufptr += strValue.size() + 1;
    data_size += strValue.size() + 1;
    return *this;
}

CSeArchive & operator>>(std::string & strValue) //Overloaded operator (read string)
{
    int len = getStrlen(bufptr);
    strValue = std::string(bufptr,len);
    bufptr += len + 1; //Skip the characters and the '\0'
    return *this;
}

2.3.3 Support vector containers (other containers can be handled in the same way)

//Write vector type
inline CSeArchive & operator<<(vector<string> & vtData) {return wr_array(vtData);}

inline CSeArchive & wr_array(vector<string> & vtData)
{
    (*this)<<(int)vtData.size(); //Write the element count first
    for(size_t i = 0; i < vtData.size(); i++)
        (*this)<<vtData[i]; //Then write each element
    return *this;
}

//Read the vector type
inline CSeArchive & operator>>(vector<string> & vtData) {return rd_array(vtData);}

inline CSeArchive & rd_array(vector<string> & vtData)
{
    int size = 0;
    (*this)>>size; //Read the element count first
    for(int i = 0; i < size; i++)
    {
        string data = "";
        (*this)>>data;
        vtData.push_back(data);
    }
    return *this;
}

2.3.4 Support objects

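//The type trait in the template condition was lost in the source; std::is_class is assumed here
template <typename T, typename std::enable_if<std::is_class<T>::value>::type * = nullptr>
inline CSeArchive & operator<<(T tParam) //Pass in object
{
    std::string tString = tParam.Serialize(); //The object serializes itself into a string
    wr_array((char*)tString.c_str(), tString.length()); //The string's bytes are stored in the cache
    return *this;
}
template <typename T, typename std::enable_if<std::is_class<T>::value>::type * = nullptr>
inline CSeArchive & operator>>(T & tParam) //Outgoing object (initialization)
{
    tParam.T::UnSerialize(*this); //Requires an UnSerialize overload that takes a CSeArchive &
    return *this;
}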

3. Specific implementation

The complete implementation code is available at https://gitee.com/Lanlvzzz/serialize.git