Data storage in memory

Table of Contents

Introduction to data types

Basic classification of types:

Storage of integers in memory

Sign-magnitude, ones' complement, two's complement

Introduction to big and small endian

Practice

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Floating point type storage in memory

An example

Floating point number storage rules

The value ranges of the built-in types can be looked up in two header files:

Integer family: limits.h

Floating-point family: float.h

Printing the same data with different format specifiers amounts to interpreting the same bits in different ways.

Introduction to data types

char //Character data type

short //Short integer type

int //integer type

long //long integer type

long long //longer integer type

float //single precision floating point number

double //double precision floating point number

These types are built into the C language. Because a character is stored and used as its ASCII code value, char is classified as a member of the integer family.

What a type means:

1. The type determines how much memory is allocated for a variable (and the size determines the range of values it can hold).

2. The type determines how the contents of that memory are interpreted.

Basic classification of types:

Integer family: (Note: whether plain char behaves like signed char or unsigned char depends on the compiler)

char

unsigned char

signed char

short

unsigned short [int] //the [int] is optional and is usually omitted

signed short [int]

int

unsigned int

signed int

long

unsigned long [int]

signed long [int]

Floating point family:

float

double

Constructed types (user-defined types):

> Array type

> Structure type struct

> Enumeration type enum

> Union type union

Pointer type

int *pi;

char*pc;

float* pf;

void* pv;

Empty type:

void represents an empty type (no type)

Usually applied to function return types, function parameters, and pointer types.

Storage of integers in memory

As mentioned earlier, creating a variable allocates space in memory, and the size of that space is determined by the variable's type.

So how is the data actually stored in the allocated memory?

Sign-magnitude, ones' complement, two's complement

A computer can represent a signed integer in binary in three ways: sign-magnitude (original code), ones' complement (inverse code) and two's complement (complement code).

All three representations consist of a sign bit and value bits. The sign bit is 0 for "positive" and 1 for "negative". For positive numbers, the sign-magnitude, ones' complement and two's complement forms are identical.

For negative integers, the three representations differ:

Sign-magnitude

Obtained by translating the value directly into binary, with the sign bit set according to whether the number is positive or negative.

Ones' complement

Obtained by keeping the sign bit of the sign-magnitude form unchanged and inverting every other bit.

Two's complement

Obtained by adding 1 to the ones' complement.

For integers, what is actually stored in memory is the two's complement.

In computer systems, integer values are always represented and stored in two's complement. The reason is that with two's complement the sign bit and the value bits can be processed uniformly, and addition and subtraction can also be handled uniformly (the CPU only has an adder). In addition, converting from two's complement back to sign-magnitude uses the same operation as converting from sign-magnitude to two's complement (invert the value bits, then add 1), so no additional hardware circuitry is required.

Introduction to big and small endian

What are big-endian and little-endian?

Big-endian (storage) mode means that the low-order bytes of the data are stored at the high addresses of memory, and the high-order bytes at the low addresses. (big-endian storage)

Little-endian (storage) mode means that the low-order bytes of the data are stored at the low addresses of memory, and the high-order bytes at the high addresses. (little-endian storage)

Why are there big-endian and little-endian modes?

Computer memory is addressed in bytes: each address corresponds to one byte, and a byte is 8 bits. However, besides the 8-bit char, C also has the 16-bit short and the 32-bit long (depending on the specific compiler). In addition, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so the question arises of how to arrange the multiple bytes in memory. This is what leads to the big-endian and little-endian storage modes.

For example, suppose a 16-bit short x is located at address 0x0010 and has the value 0x1122. Then 0x11 is the high byte and 0x22 is the low byte. In big-endian mode, 0x11 is placed at the low address 0x0010 and 0x22 at the high address 0x0011. Little-endian mode is exactly the opposite. The commonly used x86 architecture is little-endian, while KEIL C51 is big-endian. Many ARM and DSP processors are little-endian, and some ARM processors let the hardware select big-endian or little-endian mode.

Baidu 2015 system engineer written examination questions:

Please briefly describe the concepts of big-endian and little-endian, and design a small program to determine the byte order of the current machine. (10 points)

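#include <stdio.h>

int main()
{
    int a = 1;
    /* Look at the byte stored at the lowest address of a */
    char *p = (char*)&a;
    if(*p == 1)
    {
        /* The low-order byte sits at the low address */
        printf("Little-endian\n");
    }
    else
    {
        printf("Big-endian\n");
    }
    return 0;
}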

Practice

Exercise 1 (integer promotion)

//What does this print?

#include <stdio.h>

int main()
{
    char a = -1;
    signed char b = -1;
    unsigned char c = -1;
    printf("a=%d,b=%d,c=%d", a, b, c);
    return 0;
}

The output is a=-1,b=-1,c=255 (assuming plain char is signed on this compiler).

Sign-magnitude (original code) of -1: 10000000 00000000 00000000 00000001

Ones' complement (inverse code): 11111111 11111111 11111111 11111110

Two's complement (complement code): 11111111 11111111 11111111 11111111

Stored into a char, the value is truncated to the low 8 bits: 11111111 (255 when read as unsigned)

When printed, integer promotion is performed. For the signed types, the high bits are filled with the sign bit:

11111111 11111111 11111111 11111111

Inverting the value bits gives 10000000 00000000 00000000 00000000

Adding 1 gives the sign-magnitude form 10000000 00000000 00000000 00000001

So we get -1.

During integer promotion of an unsigned number, the high bits are simply filled with 0, so for the unsigned char

the two's complement after promotion is 00000000 00000000 00000000 11111111

When this is printed, the sign bit is 0. For a positive number the sign-magnitude, ones' complement and two's complement forms are the same, so it is printed directly as 255.

If an operand's type is narrower than int, integer promotion is performed first. Once the operands are at least int, the usual arithmetic conversions are applied.

Exercise 2

#include <stdio.h>

int main()
{
    char a = -128;
    printf("%u\n", a);
    return 0;
}

Values are always stored in two's complement, but %d and %u interpret the stored bits differently. %d treats the bits as a signed number: a positive value is printed directly, and a negative value is converted back to sign-magnitude to obtain the number to print. %u does not care about the sign at all: it treats the bits as an unsigned number, for which sign-magnitude, ones' complement and two's complement are the same, and prints them directly (after integer promotion, of course).

Exercise 3

int i = -20;

unsigned int j = 10;

printf("%d\n", i + j);

//The addition is carried out on the two's complement bit patterns, and the result is finally formatted as a signed integer

It prints -10: nothing is truncated, so the bit pattern of the result is that of -10.

With %u, of course, the printed result would be different.

For unsigned int:

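#include <stdio.h>
#include <unistd.h>
int main()
{
    unsigned int i;
    /* i is unsigned, so i >= 0 is always true: after 0, i-- wraps around to the largest unsigned int */
    for(i = 9; i >= 0; i--)
    {
        printf("%u\n", i);
        sleep(1); //sleep() from unistd.h (macOS/Linux) takes seconds; Sleep() from Windows.h takes milliseconds
    }

    return 0;
}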

Exercise 4

#include <stdio.h>
#include <string.h>

int main()
{
    char a[1000];
    int i;
    for(i = 0; i < 1000; i++)
    {
        a[i] = -1 - i;
    }
    printf("%zu", strlen(a)); //strlen returns size_t, so print it with %zu
    return 0;
}

Exercise 5

#include <stdio.h>

unsigned char i = 0;
int main()
{
    for(i = 0; i <= 255; i++)
    {
        printf("hello world\n");
    }
    return 0;
}

This is an infinite loop: an unsigned char can never exceed 255, so i <= 255 is always true. Be careful when using unsigned numbers in loop conditions.

Storage of floating point types in memory

Common floating point numbers:

3.14159

1E10

The floating point number family includes: float, double, long double types.

The range of floating point number representation: defined in float.h

An example

An example of floating-point storage. It shows that integers and floating-point values are stored and read differently in memory.

#include <stdio.h>

int main()
{
    int n = 9;
    float* pFloat = (float*)&n;
    printf("The value of n is: %d\n", n);
    printf("The value of *pFloat is: %f\n", *pFloat);

    *pFloat = 9.0; //Store 9.0 as a floating point number
    printf("The value of n is: %d\n", n);
    printf("The value of *pFloat is: %f\n", *pFloat);
    return 0;
}

Floating point number storage rules

n and *pFloat are clearly the same bytes in memory. Why do the floating-point and integer interpretations differ so much?

To understand this result, you must understand how floating point numbers are represented internally in the computer.

Detailed interpretation:

According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating point number V can be expressed in the following form:

(-1)^S * M * 2^E

(-1)^S represents the sign. When S=0, V is positive; when S=1, V is negative.

M represents the significand, which is greater than or equal to 1 and less than 2.

2^E represents the exponent.

For example:

5.0 in decimal is 101.0 in binary, which is equivalent to 1.01×2^2.

Then, according to the format of V above, we can get S=0, M=1.01, E=2.

-5.0 in decimal is -101.0 written in binary, which is equivalent to -1.01×2^2. Then, S=1, M=1.01, E=2.

IEEE 754 stipulates:

For a 32-bit floating point number, the highest bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the significand M.

For a 64-bit floating point number, the highest bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significand M.

IEEE 754 also has some special provisions for the significand M and the exponent E.

As mentioned before, 1≤M<2, so M can be written in the form 1.xxxxxx, where xxxxxx is the fractional part.

IEEE 754 stipulates that when M is stored, its first digit is assumed to always be 1, so it can be dropped and only the xxxxxx part is saved.

For example, when saving 1.01, only 01 is stored, and the leading 1 is added back when reading. The purpose is to gain 1 bit of precision. Taking a 32-bit floating point number as an example, only 23 bits are available for M, but since the leading 1 is not stored, 24 significant bits can be represented.

As for the exponent E, the situation is more complicated.

First, E is stored as an unsigned integer (unsigned int).

This means that if E is 8 bits, its value range is 0~255; if E is 11 bits, its value range is 0~2047. However, we know that E in scientific notation can be negative.

Therefore, IEEE 754 stipulates that an intermediate number (the bias) must be added to the real value of E when it is stored in memory.

For an 8-bit E, this intermediate number is 127;

For an 11-bit E, this intermediate number is 1023.

For example, the E of 2^10 is 10, so when it is saved as a 32-bit floating point number, it must be saved as 10 + 127 = 137, which is 10001001.

When the exponent E is read back from memory, there are three cases:

E is neither all 0s nor all 1s

In this case, the floating point number is obtained as follows: subtract 127 (or 1023) from the stored value of the exponent E to get the real exponent, then put the leading 1 back in front of the significand M.

For example:

The binary form of 0.5 (1/2) is 0.1. Since the integer part must be 1, the binary point is moved one place to the right, which gives

1.0*2^(-1). The stored exponent is -1 + 127 = 126, expressed as

01111110, and the significand 1.0 with the integer part removed is 0, padded with 0s to 23 bits: 00000000000000000000000. So the binary representation is:

0 01111110 00000000000000000000000

E is all 0s

In this case, the real exponent of the floating point number is 1-127 (or 1-1023).

The leading 1 is no longer added to the significand M, which is instead read as a fraction 0.xxxxxx. This makes it possible to represent ±0 and very small numbers close to 0.

E is all 1s

In this case, if the significand bits M are all 0, the value is ±infinity (the sign is given by the sign bit S); otherwise the value is NaN (not a number).

Explaining the earlier example:

Now, let's go back to the original question: why does 0x00000009, interpreted as a floating point number, become 0.000000?

First, split 0x00000009: the sign bit is S=0, and the following 8-bit exponent is E=00000000.

The last 23 significand bits are M=000 0000 0000 0000 0000 1001.

9 -> 0000 0000 0000 0000 0000 0000 0000 1001

Since the exponent E is all 0s, this is the second case in the previous section. Therefore, the floating point number V is written as:

V=(-1)^0 × 0.00000000000000000001001×2^(-126)=1.001×2^(-146)

Clearly V is a very small positive number close to 0, so printed with %f (six decimal places) it shows as 0.000000.

Now for the second part of the example.

How is the floating point number 9.0 represented in binary, and what does that bit pattern equal when read as a decimal integer?

First, the floating point number 9.0 is equal to 1001.0 in binary, which is 1.001×2^3.

9.0 -> 1001.0 -> (-1)^0 × 1.001 × 2^3 -> S=0, M=1.001, E=3 + 127=130

Then the sign bit is S=0, the stored significand is 001 followed by 20 0s to fill 23 bits, and the stored exponent is 3 + 127 = 130, which is 10000010.

Therefore, written in binary in the order S + E + M, it is

0 10000010 001 0000 0000 0000 0000 0000

This 32-bit binary number, read as a decimal integer, is exactly 1091567616.
