In-depth explanation of custom types (structures, enumerations, unions)

Foreword

Most of us are familiar with structures, but a structure is only one kind of custom (user-defined) type. What is special about these custom types? This article goes through structures, bit fields, enumerations, and unions in turn.

1. Structure

1. Declaration of structure

1.1 Basic knowledge of structures

A structure is a collection of values called member variables. The members of a structure may be of different types.

1.2 Declaration of structure

Structure declarations can be placed in .h files.
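As a minimal sketch of what such a declaration looks like, the hypothetical struct Book below could live in a header file. The tag, the member names, and the helper make_book are assumptions made for illustration; they do not come from the original text.

```c
#include <string.h>

/* Hypothetical record type; tag and members are illustrative only. */
struct Book
{
    char title[32]; /* title, stored as a C string */
    int price;      /* price in whole units */
};                  /* the semicolon after the closing brace is required */

/* Build a struct Book; the title is truncated if it does not fit. */
struct Book make_book(const char* title, int price)
{
    struct Book b;
    memset(&b, 0, sizeof b);                     /* zero everything, including the terminator */
    strncpy(b.title, title, sizeof b.title - 1); /* leave room for the final '\0' */
    b.price = price;
    return b;
}
```

The declaration only describes the type; no memory is reserved until a variable such as b is defined.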

1.3 Special Statement

//Declaration of anonymous structure
struct
{
    int a;
    char b;
    float c;
}x;
struct
{
    int a;
    char b;
    float c;
}a[20], *p;

Both declarations above omit the struct tag.

This raises a question. Given the code above, is the following statement legal?

p = &x;

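No, it is not. Compiling it in VS produces a diagnostic (depending on the compiler and language mode, an error or an incompatible-pointer-type warning).

Warning:

The compiler treats the two declarations above as two completely different types, even though their member lists are identical.

So the type of &x does not match the type of p, and the assignment is illegal.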
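One way around this, sketched below, is to give the structure a tag so that the variable, the array, and the pointer all share a single type. The tag T and the helper read_through_pointer are hypothetical names used only for this illustration.

```c
/* Giving the structure a tag makes every use refer to one type. */
struct T
{
    int a;
    char b;
    float c;
};

struct T x;         /* a variable of type struct T */
struct T a[20], *p; /* an array and a pointer of the same type */

/* Store a value in x, point p at x, and read the value back through p. */
int read_through_pointer(int value)
{
    x.a = value;
    p = &x; /* legal: &x has type struct T*, the same type as p */
    return p->a;
}
```

An anonymous structure is therefore only practical when the type is needed exactly once.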

1.4 Self-reference of structure

struct Node
{
    int data;
    struct Node next;
};

If a structure contains a member of its own type, as in the code above, the compiler reports an error: struct Node is still incomplete at the point where next is declared, and a structure that contained itself would have no finite size. A structure cannot reference itself this way. Let's try another form of self-reference.

struct Node
{
    int data;
    struct Node* next;
};

This form is fine, because a pointer has a known size no matter what it points to. A structure can refer to itself only through a pointer.

typedef struct
{
    int data;
    Node* next;
}Node;

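Is the code above valid?

The answer is no. The name Node is introduced by the typedef only at the end of the declaration, so it is not yet known when the member next is declared.

Now look at this version.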

typedef struct Node
{
    int data;
    struct Node* next;
}Node;

Compiling both versions in VS shows that only the second one is accepted. The compiler processes a declaration from top to bottom: the tag struct Node is visible as soon as it has been written, so it can be used inside the member list, whereas the typedef name Node does not exist until the whole declaration is complete. If the tag is omitted, the members have no name by which to refer to the structure.
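To show the accepted form in use, here is a minimal sketch of a singly linked list built on the typedef struct Node declaration. The helper functions push_front, sum_list, and free_list are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

typedef struct Node
{
    int data;
    struct Node* next; /* the tag struct Node is already visible here */
} Node;                /* the typedef name Node exists only from here on */

/* Insert a new node at the front; returns the old head unchanged if allocation fails. */
Node* push_front(Node* head, int data)
{
    Node* n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->data = data;
    n->next = head;
    return n;
}

/* Walk the list through the self-referencing pointer and add up the data. */
int sum_list(const Node* head)
{
    int total = 0;
    for (const Node* cur = head; cur != NULL; cur = cur->next)
        total += cur->data;
    return total;
}

/* Release every node in the list. */
void free_list(Node* head)
{
    while (head != NULL)
    {
        Node* next = head->next;
        free(head);
        head = next;
    }
}
```

After the declaration is complete, code outside the structure can use the shorter name Node freely.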

1.5 Definition and initialization of structure variables

struct Point
{
int x;
int y;
}p1; //Declare the type and define variable p1 at the same time

struct Point p2; //Define structure variable p2

struct Point p3 = { 1, 2 }; //x = 1, y = 2
//Initialization: define a variable and assign initial values at the same time.

struct Stu //Type declaration
{
char name[15];//name
int age; //Age
};
struct Stu s = { "zhangsan", 20 };//Initialization

struct Node
{
int data;
struct Point p;
struct Node* next;
}n1 = { 10, {4,5}, NULL }; // Nested initialization of structure

struct Node n2 = { 20, {5, 6}, NULL }; // Nested initialization of structure
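As a small sketch of how the nested initializer is read, the function below initializes one struct Node positionally, as above, and a second one with C99 designated initializers, then checks that both hold the same values. Designated initializers are not used in the original text; they are shown here only as an alternative spelling, and the function name same_after_init is hypothetical.

```c
#include <stddef.h>

struct Point
{
    int x;
    int y;
};

struct Node
{
    int data;
    struct Point p;
    struct Node* next;
};

/* Returns 1 when positional and designated initialization give the same result. */
int same_after_init(void)
{
    /* Positional: values are matched to members in declaration order;
       the inner braces fill the nested struct Point. */
    struct Node n2 = { 20, {5, 6}, NULL };

    /* Designated (C99): each member is named explicitly. */
    struct Node n3 = { .data = 20, .p = { .x = 5, .y = 6 }, .next = NULL };

    return n2.data == n3.data
        && n2.p.x == n3.p.x
        && n2.p.y == n3.p.y
        && n2.next == n3.next;
}
```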

1.6 Structure memory alignment

Now that we know how structures are declared and defined, the next question is how the size of a structure is calculated, and what rules the C language applies.

Let's start by calculating some structure sizes:

//Exercise 1
struct S1
{
char c1;
int i;
char c2;
};
printf("%zu\n", sizeof(struct S1));
//Exercise 2
struct S2
{
char c1;
char c2;
int i;
};
printf("%zu\n", sizeof(struct S2));
//Exercise 3
struct S3
{
double d;
char c;
int i;
};
printf("%zu\n", sizeof(struct S3));
//Exercise 4-structure nesting problem
struct S4
{
char c1;
struct S3 s3;
double d;
};
printf("%zu\n", sizeof(struct S4));

How are these sizes calculated?

If we simply add up the sizes of the members, the result differs from what the compiler reports.

Why is that?

struct S1 holds two char members and one int. Shouldn't it be 6 bytes? On VS it is 12.

The difference comes from memory alignment. The alignment rules are:

1. The first member is placed at offset 0 from the start of the structure variable.
2. Every other member is aligned to an address that is an integer multiple of a certain number (its alignment number).
Alignment number = the smaller of the compiler's default alignment number and the size of the member.
The default value in VS is 8.
3. The total size of the structure is an integer multiple of the largest alignment number (every member has its own alignment number).
4. If a structure is nested, the nested structure is aligned to an integer multiple of its own largest alignment number, and the total size of the outer structure is an integer multiple of the largest alignment number among all members (including the members of nested structures).
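Applying the rules to the four exercises, assuming a 4-byte int and an 8-byte double aligned to 8 (as on VS and on typical 64-bit GCC/Clang targets; other platforms may differ):

- S1: c1 at offset 0, i aligned to 4 so it sits at 4..7, c2 at 8. That is 9 bytes, rounded up to a multiple of 4, giving 12.
- S2: c1 at 0, c2 at 1, i at 4..7. Total 8.
- S3: d at 0..7, c at 8, i at 12..15. Total 16.
- S4: c1 at 0, s3 aligned to 8 so it sits at 8..23, d at 24..31. Total 32.

The sketch below checks these offsets with the standard offsetof macro. The function show_layout is a hypothetical helper added for illustration.

```c
#include <stddef.h>
#include <stdio.h>

struct S1 { char c1; int i; char c2; };
struct S2 { char c1; char c2; int i; };
struct S3 { double d; char c; int i; };
struct S4 { char c1; struct S3 s3; double d; };

/* Print the offset of each member of S1 and S4 and the padded totals. */
void show_layout(void)
{
    printf("S1: c1 at %zu, i at %zu, c2 at %zu, size %zu\n",
           offsetof(struct S1, c1), offsetof(struct S1, i),
           offsetof(struct S1, c2), sizeof(struct S1));
    printf("S4: c1 at %zu, s3 at %zu, d at %zu, size %zu\n",
           offsetof(struct S4, c1), offsetof(struct S4, s3),
           offsetof(struct S4, d), sizeof(struct S4));
}
```

offsetof reports the byte offset of a member from the start of its structure, which makes the padding between members visible.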

Why does memory alignment exist?

Most references give two reasons:

1. Platform reasons (portability):

Not all hardware platforms can access arbitrary data at arbitrary addresses; some can fetch certain types of data only at certain addresses, and raise a hardware exception otherwise.

2. Performance reasons:

Data structures (especially the stack) should be aligned on natural boundaries whenever possible.
The reason is that the processor may need two memory accesses to read unaligned data, whereas aligned data needs only one.

In general:

Structure memory alignment trades space for time.

So how can we satisfy alignment while wasting as little space as possible?

Here's a way:

Place the members that take up little space next to each other, which reduces the padding between members.

//For example:
struct S1
{
    char c1;
    int i;
    char c2;
};
struct S2
{
    char c1;
    char c2;
    int i;
};

The two declarations have the same members, but their sizes differ (12 bytes for S1 and 8 bytes for S2 on VS) simply because the members are declared in a different order.

1.7 Modify the default alignment number

Since the default alignment number matters so much, how do we modify it?

The preprocessing directive #pragma pack does this.

Writing #pragma pack(n) before a structure declaration sets the default alignment number to n for the declarations that follow, and the size of the structure changes accordingly. For example, with #pragma pack(1) a structure holding a char, an int, and a char occupies 6 bytes instead of 12.

To cancel the modification after the declaration, write #pragma pack() with nothing inside the parentheses; this restores the compiler's default.
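A minimal sketch of the directive in use follows. The struct names Packed and Unpacked and the two helper functions are hypothetical; #pragma pack itself is supported by MSVC, GCC, and Clang, but it is a compiler extension and not part of standard C.

```c
#include <stddef.h>

#pragma pack(1) /* set the default alignment number to 1 */
struct Packed
{
    char c1;
    int i;
    char c2;
};
#pragma pack() /* empty parentheses: restore the compiler's default */

/* Same members, declared under the default alignment. */
struct Unpacked
{
    char c1;
    int i;
    char c2;
};

/* With alignment 1 there is no padding: the size is the sum of the members. */
size_t packed_size(void)
{
    return sizeof(struct Packed);
}

/* With default alignment, padding is inserted around the int. */
size_t unpacked_size(void)
{
    return sizeof(struct Unpacked);
}
```

Packing saves space but can make member access slower, or even invalid on hardware that requires aligned access, so it is usually reserved for cases where the layout is dictated from outside, such as file or wire formats.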

1.8 Structure parameter passing

Structures can also be passed to functions as arguments. How should they be passed?

#include <stdio.h>

struct S
{
int data[1000];
int num;
};

void print1(struct S s)
{
printf("%d", s.num);
}
void print2(struct S* s)
{
printf("%d", s->num);
}
int main()
{
struct S s = { {1,2,3,4},1};
print1(s); //pass by value: the whole structure is copied
print2(&s); //pass by address: only a pointer is copied
return 0;
}

The code above shows two ways to pass a structure to a function: by value (print1) and by address (print2).

When a function is called, its arguments are pushed onto the stack. Passing by value copies the entire structure, so for a large structure such as struct S (about 4000 bytes with a 4-byte int) the copy costs both time and stack space, and performance suffers. Passing the address copies only a pointer, and the function reaches the original structure through it, which is far cheaper.

Summary:

When passing a structure to a function, prefer passing its address.
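When a function only reads the structure, the pointer parameter can be declared const so the caller's data cannot be modified by accident. The sketch below assumes, purely for illustration, that num holds the number of valid elements in data; the original code does not say what num means, and sum_data is a hypothetical helper.

```c
#include <stddef.h>

struct S
{
    int data[1000];
    int num; /* assumed here to be the count of valid elements in data */
};

/* Sum the first s->num elements; const promises not to modify the caller's structure. */
long sum_data(const struct S* s)
{
    long total = 0;
    for (int k = 0; k < s->num && k < 1000; k++)
        total += s->data[k];
    return total;
}
```

Only sizeof(struct S*) bytes are copied for the call, regardless of how large struct S is.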

2. Bit field

Having covered structures, we should also look at how structures are used to implement bit fields.

This topic is rarely covered in university courses.

So what is a bit field?

2.1 What is a bit field

A bit field is declared much like a structure, with two differences.

  • The member type of a bit field is normally int, unsigned int, or signed int (many compilers also accept char and other integer types).
  • The member name of a bit field is followed by a colon and a number, which is the width of the member in bits.

For example:

struct A
{
    int _a:2;
    int _b:5;
    int _c:10;
    int _d:30;
};

This is a bit-field type.

What is the size of struct A?

On VS, sizeof(struct A) is 8. Why 8 bytes, when the members need only 47 bits in total?

The allocation rules below explain it.

2.2 Memory allocation of bit fields

1. The members of a bit field can be of type int, unsigned int, signed int, or char (all belonging to the integer family).
2. Space for a bit field is allocated as needed, 4 bytes (int) or 1 byte (char) at a time.
3. Bit fields involve many implementation-defined details and are not portable across platforms. Programs that care about portability should avoid them.
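Applying rule 2 to struct A from section 2.1, assuming a 32-bit int: _a, _b, and _c need 2 + 5 + 10 = 17 bits and fit in the first 4-byte unit. Only 15 bits remain, which is not enough for the 30 bits of _d, so a second 4-byte unit is allocated and the total is 8 bytes. The sketch below compares this with an ordinary structure of four int members; struct Plain and the two helper functions are hypothetical names for illustration, and the exact layout is implementation-defined.

```c
#include <stddef.h>

/* Bit-field version: 47 bits of payload. */
struct A
{
    int _a:2;
    int _b:5;
    int _c:10;
    int _d:30;
};

/* Ordinary version: four full int members. */
struct Plain
{
    int a;
    int b;
    int c;
    int d;
};

size_t bit_field_size(void)
{
    return sizeof(struct A);
}

size_t plain_size(void)
{
    return sizeof(struct Plain);
}
```

On the mainstream compilers mentioned above the bit-field version takes half the space of the plain one.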

//An example
struct S
{
char a : 3;
char b : 4;
char c : 5;
char d : 4;
};
int main()
{
struct S s = { 0 };
s.a = 10;
s.b = 12;
s.c = 3;
s.d = 4;
return 0;
}
//How is the space allocated?
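One way to answer that question is to look at the raw bytes of the structure. The sketch below is a variation of the example, with two assumptions made for illustration: the members are declared unsigned char so that storing an out-of-range value has well-defined truncation, and the helpers make_s and print_bytes are hypothetical. The value 10 (binary 1010) does not fit in 3 bits, so only the low 3 bits (010, that is 2) are kept.

```c
#include <stddef.h>
#include <stdio.h>

struct S
{
    unsigned char a : 3;
    unsigned char b : 4;
    unsigned char c : 5;
    unsigned char d : 4;
};

/* Fill the bit fields; values wider than a field are truncated to its width. */
struct S make_s(unsigned a, unsigned b, unsigned c, unsigned d)
{
    struct S s = { 0 };
    s.a = a;
    s.b = b;
    s.c = c;
    s.d = d;
    return s;
}

/* Print every byte of the structure in hexadecimal, lowest address first. */
void print_bytes(const struct S* s)
{
    const unsigned char* bytes = (const unsigned char*)s;
    for (size_t k = 0; k < sizeof *s; k++)
        printf("%02x ", bytes[k]);
    printf("\n");
}
```

Calling print_bytes on make_s(10, 12, 3, 4) commonly shows 62 03 04 on VS: a and b share the first byte, c does not fit in the one bit left over and starts a new byte, and d does the same. The standard does not guarantee this layout, so other compilers may print something different.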

2.3 Cross-platform issues with bit fields

1. Whether a plain int bit field is treated as signed or unsigned is implementation-defined.
2. The maximum width of a bit field is not fixed. (It is 16 on a 16-bit machine and 32 on a 32-bit machine, so a width of 27 causes problems on a 16-bit machine.)
3. Whether the members of a bit field are allocated from left to right or from right to left in memory is not defined by the standard.
4. When a structure contains two bit-field members and the second one is too large to fit in the bits left over by the first, it is implementation-defined whether the leftover bits are discarded or used.

Summary:

Compared with an ordinary structure, a bit field can achieve the same effect while saving space, but it comes with cross-platform problems.

2.4 Applications of bit fields

3. Enumeration

As the name suggests, to enumerate is to list values one by one.

For example, the seven days of the week can be listed one by one,

and so can colors.

Sets of values like these are what enumerations are for.

1. Definition of enumeration types

enum Day//week
{
Mon,
Tues,
Wed,
Thur,
Fri,
Sat,
Sun
};
enum Sex//gender
{
MALE,
FEMALE,
SECRET
};
enum Color//color
{
RED,
GREEN,
BLUE
};

Each definition above is an enumeration type. enum is the keyword, the identifier after enum is the name (tag) of the enumeration, and everything inside {} is an enumeration constant. Enumeration constants are separated by commas, and no comma is required after the last one.

Every enumeration constant has a value, starting from 0 by default and increasing by 1 each time. A value can also be assigned explicitly in the definition.
For example:

enum Color//color
{
    RED=1,
    GREEN=2,
    BLUE=4
};
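The numbering rules can be sketched as follows. enum Level and the function is_weekend are hypothetical additions for illustration: Level shows that after an explicit value the following constants keep counting up by 1.

```c
enum Day /* default numbering: Mon is 0, Sun is 6 */
{
    Mon,
    Tues,
    Wed,
    Thur,
    Fri,
    Sat,
    Sun
};

enum Level /* hypothetical: explicit start, then automatic increments */
{
    LOW = 10,
    MID, /* 11 */
    HIGH /* 12 */
};

/* Enumeration constants are integer constants and can be compared directly. */
int is_weekend(enum Day d)
{
    return d == Sat || d == Sun;
}
```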

2. Advantages of enumeration

Why use enumerations?

We could use #define to define constants, so why use enumerations?

Advantages of enumerations:

1. Increase code readability and maintainability
2. Compared with identifiers defined by #define, enumerations have type checking, which is more rigorous.
3. Prevents naming pollution (encapsulation)
4. Easy to debug
5. Easy to use, multiple constants can be defined at one time

3. Use of enumerations

enum Color//color
{
RED = 1,
GREEN = 2,
BLUE = 4
};
enum Color clr = GREEN;//Assign enumeration constants to enumeration variables so that the types match.
clr = 5; //Accepted by C compilers, but 5 is not a Color constant (C++ rejects this)
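Because C accepts values that are not among the enumeration constants, code that consumes an enumeration variable often handles the unexpected case explicitly. A minimal sketch, where color_name is a hypothetical helper:

```c
enum Color
{
    RED = 1,
    GREEN = 2,
    BLUE = 4
};

/* Map each enumeration constant to a readable name. */
const char* color_name(enum Color c)
{
    switch (c)
    {
    case RED:
        return "red";
    case GREEN:
        return "green";
    case BLUE:
        return "blue";
    default:
        return "unknown"; /* reached for values such as 5 */
    }
}
```

Using the constants as case labels is also where the readability advantage over bare numbers shows most clearly.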

4. Union

1. Definition of union type

A union is also a special custom type.

Like a structure, a union contains members. What makes it special is that all of its members share the same memory, which is why a union is sometimes described as a shared (common) type.

union Un
{
int i;
char a;
};
int main()
{
union Un un;
printf("%zu", sizeof(un));
return 0;
}

The code above declares a union type, defines a variable of that type, and prints its size.

2. Characteristics of union

The members of a union share the same memory space, so the size of a union variable must be at least the size of the largest member (because the union must be able to save at least the largest member)

union Un
{
int i;
char a;
};
int main()
{
union Un un;
printf("%zu\n", sizeof(un));
//Are the results output below the same?
printf("%p\n", (void*)&(un.i));
printf("%p\n", (void*)&(un.a));

//What is the result output below?
un.i = 0x11223344;
un.a = 0x55;
printf("%x\n", un.i);
return 0;
}

The characteristics of a union are clear from this: both members start at the same address, so the two printed addresses are identical. The x86 machines that VS targets store data in little-endian byte order, with the low byte at the low address, so writing un.a overwrites the lowest byte of un.i and the program prints 11223355.
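The same overlap gives a classic way to test the byte order of the current machine. The sketch below is a minimal illustration; the function name is_little_endian is hypothetical.

```c
/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void)
{
    union
    {
        int i;
        char c;
    } u;

    u.i = 1;         /* only the least significant byte of i is non-zero */
    return u.c == 1; /* c overlaps the byte of i at the lowest address */
}
```

If the byte at the lowest address holds the 1, the least significant byte is stored first, which is the definition of little-endian.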

3. Calculation of union size

  • The size of a union is at least the size of its largest member.
  • When the size of the largest member is not an integer multiple of the largest alignment number, the union size is rounded up to an integer multiple of the largest alignment number.
    union Un1
    {
    char c[5];
    int i;
    };
    union Un2
    {
    short c[7];
    int i;
    };
    //What is the result output below?
    int main()
    {
    printf("%zu\n", sizeof(union Un1));
    printf("%zu\n", sizeof(union Un2));
    return 0;
    }

The union size calculation follows the same alignment rules as structures (note that the alignment number of an array is based on the size of one element). With a 4-byte int, Un1 needs 5 bytes rounded up to a multiple of 4, giving 8, and Un2 needs 14 bytes rounded up to 16.
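The rounding step can be written out as a small calculation and compared with what the compiler reports. In the sketch below, round_up is a hypothetical helper, and the expected values assume a 2-byte short and a 4-byte int, as on VS and most current platforms.

```c
#include <stddef.h>

union Un1
{
    char c[5]; /* 5 bytes, alignment number 1 */
    int i;     /* 4 bytes, alignment number 4 */
};

union Un2
{
    short c[7]; /* 14 bytes, alignment number 2 */
    int i;      /* 4 bytes, alignment number 4 */
};

/* Round size up to the next multiple of align (align must be non-zero). */
size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}
```

For Un1 the largest member is 5 bytes and the largest alignment number is 4, so round_up(5, 4) gives the union size; for Un2 the corresponding call is round_up(14, 4).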

Summary:

That completes this summary and review of structures and the other custom types. Readers are welcome to point out anything that is wrong, and I will keep updating the article.