C language: storage of data in memory (original code, inverse code, complement code, big-endian and little-endian storage)

Table of Contents

1. Summary of type classification and memory size

2. Original code, inverse code, complement code

1) Storage of integers in memory

2) Storage of floating point numbers in memory

3. Big-endian storage and little-endian storage


1. Summary of type classification and memory size

Category               Type                                              Typical memory usage (bytes)
Integer family         char, unsigned char, signed char                  1
                       short, unsigned short, signed short               2
                       int, unsigned int, signed int                     4
                       long, unsigned long, signed long                  4 or 8 (the standard specifies sizeof(long) >= sizeof(int))
                       long long, unsigned long long, signed long long   8
Floating point family  float                                             4
                       double                                            8
Constructed type       Array type                                        Generally the number of elements multiplied by the size of the element type
                       struct                                            Determined by its members and their alignment
                       enum                                              Normally an enumeration constant is stored as an int
                       union                                             At least the size of its largest member
Pointer type           *                                                 4 or 8
Empty type             void

Here is an interesting piece of code that shows what happens when a value does not fit into the storage space of its type.

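#include <stdio.h>
#include <string.h>

int main(void)
{
  char a[1000] = {0};
  int i = 0;
  for(i = 0; i < 1000; i++)
  {
    /* only the lower eight bits of -1-i fit into a char */
    a[i] = -1-i;
  }
  /* strlen returns a size_t, which is printed with %zu */
  printf("%zu", strlen(a));
  return 0;
}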

The output of this code is 255.

The reason: a is a character array, and strlen counts characters until it meets the first terminating zero (a byte whose value is 0). Each a[i] is a char, so only the lower eight bits of -1-i are kept, and a[i] is 0 exactly when those eight bits are all 0. The problem is therefore simplified to “find the smallest i for which the lower eight bits of -1-i are all 0” (because the first zero byte sits at index i, the length of the string is i). If we only look at the lower eight bits, -1 is 11111111, which is equivalent to 255, so the lower eight bits of -1-i equal 255-i until i reaches 255. When i==255 the result (255-255) is 0, so a[255] is the first element equal to 0 and strlen(a) returns 255.

The following two figures are used to compare and understand the storage of character types and unsigned character types.

2. Original code, inverse code, complement code

1) Storage of integers in memory

There are three ways to represent signed numbers in a computer: original code (sign-magnitude), inverse code (ones' complement) and complement code (two's complement). All three consist of two parts, a sign bit and value bits. The sign bit uses 0 to represent “positive” and 1 to represent “negative”, while the three representations differ in how the value bits are encoded.

Original code: Write the number directly in binary, with the sign bit set according to whether it is positive or negative.

Inverse code: Keep the sign bit of the original code unchanged, and invert the other bits bit by bit.

Complement code: Add 1 to the inverse code to get the complement code.

What needs to be noted here is that for a positive integer the original code, inverse code, and complement code are all the same: all three take the form of the original code. The inversion and +1 steps apply only to negative numbers.

For integer types: the data stored in memory is actually the complement code.

2) Storage of floating point numbers in memory

  • Regulations

According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating point number V can be expressed in the following form: (-1)^S * M * 2^E

(-1)^S represents the sign. When S=0, V is a positive number; when S=1, V is a negative number.

M represents the significand (the significant digits), with 1 ≤ M < 2.

2^E represents the exponent part.

For example: 5.0 in decimal is 101.0 in binary, which is equivalent to 1.01×2^2. Then, according to the format of V above, we get S=0, M=1.01, E=2.

  • Storage method

For a 32-bit floating point number, the highest 1 bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the significand M.

For a 64-bit floating point number, the highest 1 bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significand M.

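Storage of M

Because 1 ≤ M < 2, M can always be written as 1.xxxxxx, where xxxxxx is the fractional part. When M is saved inside the computer, the leading 1 is therefore implied and is not stored; only the xxxxxx part is saved. For example, when saving 1.01, only 01 is saved, and the leading 1 is added back when the number is read. This frees one bit, so a 23-bit field holds 24 significant bits.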

Storage of E

First of all, E is stored as an unsigned integer. This means that if E is 8 bits, its value range is 0~255; if E is 11 bits, its value range is 0~2047. However, we know that E in scientific notation can be a negative number, so IEEE 754 stipulates that an intermediate number (the bias) must be added to the real value of E when it is stored in memory. For an 8-bit E, this intermediate number is 127; for an 11-bit E, it is 1023. For example, the E of 2^10 is 10, so when it is saved as a 32-bit floating point number, it must be saved as 10 + 127 = 137, which is 10001001.

  • Retrieve rules from memory

E is neither all 0s nor all 1s: The floating point number is recovered by the following rule: subtract 127 (or 1023) from the stored value of the exponent E to get the real exponent, and then put the leading 1 back in front of the significand M. For example: the binary form of 0.5 (1/2) is 0.1. Since the integer part must be 1, the binary point is moved to the right by 1 place, giving 1.0*2^(-1). The stored exponent is -1 + 127 = 126, expressed as 01111110, and the significand 1.0 with its integer part removed is 0, padded with 0s to 23 bits: 00000000000000000000000. The binary representation of 0.5 is therefore 0 01111110 00000000000000000000000

E is all 0s: The real exponent of the floating point number is 1-127 (or 1-1023). The significand M no longer gets the leading 1 added; it is restored as the fraction 0.xxxxxx. This is done to represent ±0 and very small numbers close to 0.

E is all 1s: If the bits of the significand M are all 0, the value is ± infinity (positive or negative depending on the sign bit S); if they are not all 0, the value is NaN (not a number).

3. Big-endian storage and little-endian storage

Big-endian (storage) mode: the low-order bytes of the data are stored at the high addresses of memory, and the high-order bytes of the data are stored at the low addresses of memory;

Little-endian (storage) mode: the low-order bytes of the data are stored at the low addresses of memory, and the high-order bytes of the data are stored at the high addresses of memory.

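To determine whether the byte order of the current machine is big-endian or little-endian, store the value 1 in an int and look at the byte at its lowest address: it is 1 on a little-endian machine and 0 on a big-endian machine. The following diagram illustrates this.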

The following is the implementation code:

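#include <stdio.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int check_sys(void)
{
 int i = 1;
 /* read only the byte at the lowest address of i */
 return (*(char *)&i);
}

int main(void)
{
 int ret = check_sys();
 if(ret == 1)
 {
  printf("little endian\n");
 }
 else
 {
  printf("big endian\n");
 }
 return 0;
}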