Falling in love with C language: storage of integers and floating point types in memory (base conversion, sign-magnitude, ones' complement, two's complement, and big and little endian)

Author: Ah Hui is extraordinary
What do you think: Life is dull, but running is windy
Column: Falling in love with C language
Drawing tool: draw.io(Free and open source drawing website)

If you find the article helpful, please like, follow, and bookmark to support the blogger. If there are any shortcomings, please point them out and the blogger will correct them promptly. Thank you for your support!

Article directory

Foreword
Data type
Storage of integers in memory
- Base conversion
- Sign-magnitude, ones' complement and two's complement
Big-endian and little-endian byte order
Storage of floating point types in memory
- Storage rules for floating point types
- Reading rules for floating point numbers

Foreword

Hello everyone! Today Ah Hui will introduce how integers and floating point types are stored in memory in C, including base conversion, sign-magnitude, ones' complement, two's complement, and big and little endian, with code-related examples along the way. Follow Ah Hui so you don't get lost; the content is full of useful information. Let's study together with Ah Hui.

Data type

integer

char
unsigned char - unsigned
signed char - signed
short
unsigned short [int]
signed short [int]
int
unsigned int
signed int
long
unsigned long [int]
signed long [int]
long long

You may be wondering, isn't char a character type?
Because characters are stored in memory as their ASCII values, which are integers, char is also classified as an integer type.

floating point

float single precision floating point type
double double precision floating point type

Constructed types

Array type
Structure type struct
Enumeration type enum
Union type union

pointer type

int * ip;
char * cp;
float * fp;
void * vp;

Empty type: void represents an empty type (no type), usually applied to the return type of a function, function parameters, and pointer types.
The type of data determines how much memory is used to store it, how the data is laid out in that memory, and how it is read back.

Take int and float as examples: on mainstream platforms both occupy 4 bytes in memory.
So how is the data of variables of these types stored in the allocated memory?

Storage of integers in memory

Base conversion

In C, an octal literal must start with the digit 0, and a hexadecimal literal must start with 0x.
A decimal literal is written with no prefix.
Decimal to binary:
Repeatedly divide the number by 2 and write down each remainder; the remainders read from last to first form the binary number. For example, 20 in decimal is 10100 in binary (16 + 4).
Binary to octal:

Each digit of an octal number is 0 ~ 7, and each of those digits needs at most 3 binary digits; for example, 7 in binary is 111. So when converting binary to octal, start from the low-order bit on the right and move left, turning every 3 binary digits into one octal digit. If fewer than 3 binary digits remain at the left end, convert them directly.

Example
Binary 01101011 converts to octal 0153 (grouped as 01 101 011). In C, a number starting with 0 is treated as octal.
Binary to hexadecimal:

Each digit of a hexadecimal number is 0~9 or a~f, and each of those digits needs at most 4 binary digits; for example, f in binary is 1111. So when converting binary to hexadecimal, start from the low-order bit on the right and move left, turning every 4 binary digits into one hexadecimal digit. If fewer than 4 binary digits remain at the left end, convert them directly.
In hexadecimal, 10~15 are represented by a, b, c, d, e, f

Example
Binary 01101011 converts to hexadecimal 0x6b (grouped as 0110 1011). A hexadecimal number is written with 0x in front.

Integers are stored in memory in two's complement form. What is two's complement?
Let's see next

Sign-magnitude, ones' complement and two's complement

The sign bit is the highest bit of the binary number: 1 means the number is negative, and 0 means it is positive.
For a positive number, the sign-magnitude form (original code), the ones' complement (inverse code) and the two's complement (complement code) are all identical.
For a negative number, the three forms are obtained as follows:

Sign-magnitude (original code)
Translate the value directly into binary, with the sign bit set according to whether the number is positive or negative.
Ones' complement (inverse code)
Keep the sign bit of the sign-magnitude form unchanged and invert every other bit.
Two's complement (complement code)
Add 1 to the ones' complement.

To get the sign-magnitude form back from the two's complement, the same procedure works: keep the sign bit unchanged, invert the other bits, and then add 1.

Here we take the most common integer type, int, as an example to show how integers are stored in memory.
On mainstream platforms int occupies 4 bytes, which is 32 bits
Example

int a = 20;
0000 0000 0000 0000 0000 0000 0001 0100 -> sign-magnitude (original code)
0000 0000 0000 0000 0000 0000 0001 0100 -> ones' complement
0000 0000 0000 0000 0000 0000 0001 0100 -> two's complement
0x 00 00 00 14 -> two's complement of 20 in hexadecimal
int b = -10;
1000 0000 0000 0000 0000 0000 0000 1010 -> sign-magnitude (original code)
1111 1111 1111 1111 1111 1111 1111 0101 -> ones' complement // the sign bit stays unchanged and the other bits are inverted
1111 1111 1111 1111 1111 1111 1111 0110 -> two's complement // ones' complement + 1
0x ff ff ff f6 -> two's complement of -10 in hexadecimal

Integers are stored in memory in binary form, but a binary number is too long to display conveniently. In the vs2022 debugger, memory is shown in hexadecimal (the results described here were observed in the vs2022 x86 environment)

A positive number cannot show that memory holds two's complement, because its three encodings are identical. A negative number makes it visible: the memory of b holds the bytes of 0xfffffff6, not those of the sign-magnitude form.

Through the above demonstration, we know that integers are stored in memory in two's complement form. Why is this?

The reason is that with two's complement, the sign bit and the value bits can be processed uniformly.
At the same time, addition and subtraction can also be handled in a unified way (the CPU only has an adder). In addition, converting two's complement to sign-magnitude and converting sign-magnitude to two's complement use the same operation, so no additional hardware circuit is required.

But have you noticed that the bytes of these numbers appear in reverse order in memory?
Why is this? Let's look down ↓

Big-endian and little-endian byte order

We now understand how integers are stored in memory, but the bytes appear in reverse order. Let's take a look
Example

#include <stdio.h>
int main()
{
 int a = 0x11223344; // a hexadecimal value makes each byte easy to observe
 return 0;
}

This brings up an important concept: big and little endianness
What are big endian and little endian?
When data larger than one byte is stored in memory, the order of its bytes has to be decided. Depending on that order, storage is divided into big-endian byte order and little-endian byte order. The specific concepts are as follows:
What is byte order? It means ordering by bytes, with each byte treated as a whole

Big endian (storage) mode: means that the low-order byte content of the data is stored at the high address of the memory, and the high-order byte content of the data is stored at the low address of the memory.
Little endian (storage) mode: means that the low-order byte content of the data is stored at the low address of the memory, and the high-order byte content of the data is stored at the high address of the memory.

From the memory layout observed in the debugger, we know that the x86 environment used with vs2022 is little endian
Why are there big and little endianness?

This is because computer memory is addressed in bytes: each address corresponds to one byte, and each byte is 8 bits. However, in C, in addition to the 8-bit char, there are also the 16-bit short and the 32-bit long (the exact sizes depend on the compiler). In addition, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so there must be a decision about how to arrange the multiple bytes. This leads to the big-endian storage mode and the little-endian storage mode.

So how to use a simple code to determine the byte order of the current machine? Let’s see next

#include <stdio.h>

// Determines which mode the current machine uses:
// returns 1 for little endian and 0 for big endian
int check_sys(void)
{
    int i = 1;
    // The int value 1 occupies 4 bytes:
    // 0000 0000 0000 0000 0000 0000 0000 0001
    // In little endian mode, the byte at the lowest address is 0000 0001
    // In big endian mode, the byte at the lowest address is 0000 0000
    // After casting &i to char*, the dereference accesses only that one byte
    return *(char *)&i;
}

int main(void)
{
    int ret = check_sys();
    if (ret == 1)
    {
        printf("Little endian\n");
    }
    else
    {
        printf("Big endian\n");
    }
    return 0;
}

Above, big and little endian were introduced using the int type, but int is not the only type affected: any integer or floating point object that occupies more than one byte is stored in big-endian or little-endian byte order

Storage of floating point types in memory

Floating point types include float, double and long double. So how are floating point types stored? Is it the same as for integer types? Let's see next

Floating point storage rules

According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating point number V can be expressed in the following form:

$$V = (-1)^{S} * M * 2^{E}$$

$(-1)^{S}$ represents the sign: when S is 1 the number is negative, and when S is 0 the number is positive
$M$ represents the significant digits; the value of M is greater than or equal to 1 and less than 2
$2^{E}$ represents the exponent part

A little confused? It doesn't matter. Here is an example.

For example, take the floating point number 5.0f
5.0 in binary is 101.0
And 101.0 = (-1)^0 * 1.01 * 2^2 // here (-1)^0 means -1 to the 0th power, and * is the multiplication sign
So S = 0, M = 1.01, E = 2
As for -5.0f
It can be expressed as (-1)^1 * 1.01 * 2^2
So S = 1, M = 1.01, E = 2

IEEE 754 stipulates:
For a 32-bit floating point number, which is the float type, the highest 1 bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the significant digit M
For a 64-bit floating point number, the highest 1 bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significant digit M

IEEE 754 also has some special provisions for the significant digits M and the exponent E.
As mentioned before, 1≤M<2, that is to say, M can be written in the form 1.xxxxxx, where xxxxxx is the fractional part.
IEEE 754 stipulates that when M is stored, the leading digit, which is always 1, is dropped and only the xxxxxx part is saved. For example, when saving 1.11, only 11 is saved, and the leading 1 is added back when reading. The purpose is to gain one bit of precision. Taking a 32-bit floating point number as an example, only 23 bits are left for M, but with the leading 1 implied, 24 significant bits can be represented.
As for the exponent E, the situation is more complicated. First, E is stored as an unsigned integer.
This means that if E is 8 bits, its range is 0 ~ 255; if E is 11 bits, its range is 0 ~ 2047. However, E in scientific notation can be negative, so IEEE 754 stipulates that an intermediate number (the bias) is added to the real value of E before it is stored. For an 8-bit E this number is 127; for an 11-bit E it is 1023. For example, the E of 2^10 is 10, so when it is saved in a 32-bit floating point number, it is stored as 10 + 127 = 137, which is 10001001

Example

float f = 5.5f;
Its binary representation is 101.1 // 0.5 is 2 to the power of -1
which equals (-1)^0 * 1.011 * 2^2
so S = 0, M = 1.011, and the real exponent is 2, stored as E = 2 + 127 = 129
The binary representation of 129 is 10000001
The 32 stored bits are therefore 0 10000001 01100000000000000000000
However, some decimal fractions cannot be represented exactly in binary. The digits after the decimal point cannot be fully expressed, so the stored value may be slightly off and precision may be lost
Now that we understand how floating point numbers are stored in memory, how are they read back? Let's see next

Reading rules for floating point numbers

There are three cases:

When E is neither all 0s nor all 1s:
In this case the floating point number follows these rules:
Subtract 127 (or 1023) from the stored value of exponent E to get the real exponent, and put the leading 1 back in front of the significant digits M.
Example: for 5.5f the stored E is 10000001 (129), so the real exponent is 129 - 127 = 2 and M is restored to 1.011.
When E is all 0s:
In this case the real exponent of the floating point number is 1-127 (or 1-1023). The significant digits M no longer get the leading 1 and are read as a fraction of the form 0.xxxxxx. This is done to represent ±0 and very small numbers close to 0
When E is all 1s:
In this case, if the significant digits M are all 0, the value is ±infinity (positive or negative depending on the sign bit S); if M is not all 0, the value is not a number (NaN)

That concludes Ah Hui's sharing about how integers and floating point types are stored in memory in C. I hope this blog helps everyone gain something. If you think Ah Hui's writing is good, please remember to give it a like. Thank you, your support is the biggest motivation for my creation