Linux buffer/disk/inode

1. Buffer

1. What is a buffer

The essence of the buffer: it is a piece of memory.

2. Why buffers exist

  • They free up the process that writes the data: the process hands its data to the buffer and continues, instead of waiting for a slow device.
  • A buffer lets data be flushed in batches, which reduces the number of IO operations and improves the efficiency of the whole machine.

3. Buffer refresh strategy

  • Refresh immediately (no buffering): fflush()
    This is rare in practice, for example calling fflush manually to flush the buffer right after calling printf.
  • Line refresh (line buffering): the display
    Output on the display has to match how people read, so it uses line buffering instead of full buffering. Full buffering would greatly reduce the number of IO operations and save time, but the data would sit in the buffer until the buffer filled up and then appear on the screen all at once, which is hard to read. Line buffering keeps the output readable without making IO efficiency too low.
  • Refresh when the buffer is full (full buffering): disk files
  • Special refresh situations
    The user forces a refresh, or the process exits.
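
The three strategies above correspond to the modes accepted by the standard setvbuf function (_IONBF, _IOLBF, _IOFBF). The following is a minimal sketch, not part of the original notes: the helper name bytes_before_flush is hypothetical, and it uses a temporary file so that the effect can be measured with fstat.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Writes `text` to a temporary file using the given buffering mode
 * (_IONBF, _IOLBF or _IOFBF) and returns how many bytes have reached
 * the file before any explicit flush, or -1 on error.
 * The function name is hypothetical, chosen for this illustration. */
long bytes_before_flush(int mode, const char *text)
{
    FILE *fp = tmpfile();
    if (fp == NULL) return -1;

    /* setvbuf must be called before any other operation on the stream */
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0) {
        fclose(fp);
        return -1;
    }

    fputs(text, fp);

    /* Ask the kernel how large the file is right now */
    struct stat st;
    long size = -1;
    if (fstat(fileno(fp), &st) == 0)
        size = (long)st.st_size;

    fclose(fp); /* closing flushes whatever is still buffered */
    return size;
}
```

On a typical Linux/glibc system, writing "hello" reports 5 bytes in unbuffered mode and 0 bytes in full or line buffering mode, while writing "hello\n" in line buffering mode reports 6 bytes, because the newline triggers the flush.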

4. FILE

Every IO library function ultimately goes through a system call interface, and the kernel identifies an open file only by its fd. Library functions wrap system calls, so the FILE structure in the C library must encapsulate an fd.
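
One way to see the fd inside a FILE is the POSIX function fileno. A minimal sketch, where the wrapper name fd_of is hypothetical:

```c
#include <stdio.h>

/* Returns the file descriptor wrapped by a FILE stream.
 * fileno is a POSIX function; fd_of is a hypothetical wrapper. */
int fd_of(FILE *stream)
{
    return fileno(stream);
}
```

Calling fd_of on stdin, stdout and stderr gives 0, 1 and 2, the three descriptors every process starts with.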

4.1. Buffer and C library functions

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
int main()
{
    const char *str1 = "hello printf\n";
    const char *str2 = "hello fprintf\n";
    const char *str3 = "hello fputs\n";
    const char *str4 = "hello write\n";

    //C library functions: the data goes into the stdout buffer first
    printf("%s", str1);
    fprintf(stdout, "%s", str2);
    fputs(str3, stdout);

    //System call: the data goes straight to the kernel
    write(1, str4, strlen(str4));

    //fork runs after all of the output calls above
    fork(); //Create a child process
    return 0;
}

Run directly, the program prints each of the four lines once. Redirected to a file (for example ./a.out > log.txt), the lines from printf, fprintf and fputs (library functions) each appear twice, while the line from write (a system call) appears only once. Why? It must be related to fork!

This phenomenon comes from the buffer. It also shows that the buffer in question is not in the kernel, otherwise the output of write would be duplicated too. The buffer is provided at the user level by the language: it lives in the FILE structures that stdin/stdout/stderr point to. A FILE structure contains both an fd and a buffer. To force a refresh, call fflush(FILE*); to close a file, call fclose(FILE*). Both take a FILE* because they flush the buffer inside the FILE structure it points to.

  • Generally, C library functions use full buffering when writing to files and line buffering when writing to the display.
  • The printf, fprintf and fputs library functions have their own buffer (the progress bar example illustrates this). When output is redirected to a regular file, the buffering mode changes from line buffering to full buffering.
  • The data placed in the buffer is therefore not flushed immediately, not even by the time fork is called.
  • When a process exits, its buffer is flushed and the data is written to the file.
  • At fork, the parent's data is shared with the child copy-on-write, so the child has the same buffered data as the parent. When each process flushes on exit, two copies of the data are written.
  • The output of write does not change, which shows that write has no such user-level buffer.

To sum up: the printf, fprintf and fputs library functions have their own buffer, but the write system call does not. The buffers discussed here are all user-level buffers. To improve the performance of the whole machine, the OS also provides kernel-level buffers, but those are beyond the scope of this discussion.

Who provides the user-level buffer? printf, fprintf and fputs are library functions, and write is a system call. Library functions sit in the "upper layer" above system calls and are an "encapsulation" of them. Since write has no buffer while printf, fprintf and fputs do, the buffer must be an extra layer added on top of the system call, and because the language is C, it is provided by the C standard library.
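
The duplication can be reproduced in a controlled way. The sketch below is an illustration under assumptions rather than the original experiment: copies_after_fork is a hypothetical helper, it writes to a temporary file (which is fully buffered), and the child closes the stream explicitly and then calls _exit so that only this one stream is flushed.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Writes one line into a fully buffered stream, forks, lets both
 * processes flush, and returns how many copies of the line ended up
 * in the file (-1 on error). Hypothetical helper for illustration. */
int copies_after_fork(int flush_before_fork)
{
    FILE *fp = tmpfile();          /* regular file: fully buffered */
    if (fp == NULL) return -1;

    fputs("hello\n", fp);          /* stays in the user-level buffer */
    if (flush_before_fork)
        fflush(fp);                /* buffer is empty before fork */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(fp);
        return -1;
    }
    if (pid == 0) {
        fclose(fp);                /* child flushes its copy of the buffer */
        _exit(0);
    }
    waitpid(pid, NULL, 0);

    fflush(fp);                    /* parent flushes its own copy */
    rewind(fp);

    int copies = 0;
    int c;
    while ((c = fgetc(fp)) != EOF)
        if (c == '\n') copies++;

    fclose(fp);
    return copies;
}
```

Without the flush, parent and child each hold a copy of the buffered line and the file receives it twice. With the flush before fork, the buffer is already empty when it is copied, so the line appears once.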

4.2. Simulating line buffering

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <assert.h>

#define NUM 1024

#define NONE_FLUSH 0x0
#define LINE_FLUSH 0x1
#define FULL_FLUSH 0x2

typedef struct _MyFILE{
    int _fileno;
    char _buffer[NUM];
    int _end;
    int _flags; //fflush method
}MyFILE;

MyFILE *my_fopen(const char *filename, const char *method)
{
    assert(filename);
    assert(method);

    int flags = O_RDONLY;

    if(strcmp(method, "r") == 0)
    {
        flags = O_RDONLY;
    }
    else if(strcmp(method, "r+") == 0)
    {
        flags = O_RDWR;
    }
    else if(strcmp(method, "w") == 0)
    {
        flags = O_WRONLY | O_CREAT | O_TRUNC;
    }
    else if(strcmp(method, "w+") == 0)
    {
        flags = O_RDWR | O_CREAT | O_TRUNC;
    }
    else if(strcmp(method, "a") == 0)
    {
        flags = O_WRONLY | O_CREAT | O_APPEND;
    }
    else if(strcmp(method, "a+") == 0)
    {
        flags = O_RDWR | O_CREAT | O_APPEND;
    }

    int fileno = open(filename, flags, 0666);
    if(fileno < 0)
    {
        return NULL;
    }

    MyFILE *fp = (MyFILE *)malloc(sizeof(MyFILE));
    if(fp == NULL) return fp;
    memset(fp, 0, sizeof(MyFILE));
    fp->_fileno = fileno;
    fp->_flags |= LINE_FLUSH;
    fp->_end = 0;
    return fp;
}

void my_fflush(MyFILE *fp)
{
    assert(fp);

    if(fp->_end > 0)
    {
        //Hand the buffered data to the kernel
        write(fp->_fileno, fp->_buffer, fp->_end);
        fp->_end = 0;
        //Ask the kernel to write this file's data to the disk
        fsync(fp->_fileno);
    }
}


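void my_fwrite(MyFILE *fp, const char *start, int len)
{
    assert(fp);
    assert(start);
    assert(len > 0);
    assert(len <= NUM); //this sketch does not handle writes larger than the buffer

    //If the new data does not fit, flush what is already buffered
    if(fp->_end + len > NUM) my_fflush(fp);

    //Copy the data into the buffer (memcpy, because the data is not necessarily a C string)
    memcpy(fp->_buffer + fp->_end, start, len);
    fp->_end += len;

    if(fp->_flags & LINE_FLUSH)
    {
        //Line buffering: flush when the buffered data ends with a newline
        if(fp->_end > 0 && fp->_buffer[fp->_end-1] == '\n')
        {
            //Just write to the kernel
            write(fp->_fileno, fp->_buffer, fp->_end);
            fp->_end = 0;
            fsync(fp->_fileno);
        }
    }
    else if(fp->_flags & FULL_FLUSH)
    {
        //Full buffering: nothing to do here, the buffer is flushed when it fills up
    }
    //NONE_FLUSH is 0x0, so it cannot be tested with &; it is not handled in this sketch
}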

void my_fclose(MyFILE *fp)
{
    my_fflush(fp);
    close(fp->_fileno);
    free(fp);
}

int main()
{
    MyFILE *fp = my_fopen("log.txt", "w");
    if(fp == NULL)
    {
        printf("my_fopen error\n");
        return 1;
    }

    //const char *s = "hello my 111\n";
    //my_fwrite(fp, s, strlen(s));

    //printf("Message is refreshed immediately");
    //sleep(3);

    //const char *ss = "hello my 222";
    //my_fwrite(fp, ss, strlen(ss));
    //printf("A string that does not meet the refresh conditions is written\n");
    //sleep(3);

    //const char *sss = "hello my 333";
    //my_fwrite(fp, sss, strlen(sss));
    //printf("A string that does not meet the refresh conditions is written\n");
    //sleep(3);


    //const char *ssss = "end\n";
    //my_fwrite(fp, ssss, strlen(ssss));
    //printf("A string that satisfies the refresh condition is written\n");
    //sleep(3);


    const char *s = "-aaaaaaa";
    my_fwrite(fp, s, strlen(s));
    printf("A string that does not meet the refresh conditions was written\n");
    fork();

    //Simulate process exit: both parent and child flush their copy of the buffer
    my_fclose(fp);
    return 0;
}

5. Standard output and standard error

5.1. Standard output and standard error redirection

#include <iostream>
#include <cstdio>
#include <errno.h>

int main()
{
    //stdout
    printf("hello printf 1\n");
    fprintf(stdout, "hello fprintf 1\n");
    fputs("hello fputs 1\n", stdout);

    //stderr
    fprintf(stderr, "hello fprintf 2\n");
    fputs("hello fputs 2\n", stderr);
    perror("hello perror 2");

    //cout
    std::cout << "hello cout 1" << std::endl;

    //cerr
    std::cerr << "hello cerr 2" << std::endl;
}

[abc@aly 0820]$ ./a.out 1> stdout.txt 2>stderr.txt

The command performs two redirections. The first, 1> stdout.txt, redirects standard output (fd 1), which would otherwise go to the display, into stdout.txt. The second, 2>stderr.txt, redirects standard error (fd 2) into stderr.txt. The lines tagged 1 and the lines tagged 2 in the program therefore end up in separate files. The main benefit is that you can tell apart what is the program's normal output and what is error output.

[abc@aly 0820]$ ./a.out > all.txt 2>&1

This command sends both streams into the same file. The shell handles the redirections from left to right. First, > all.txt opens all.txt (the new file gets its own file descriptor, for example 3) and makes fd 1 refer to it, so everything a.out writes to standard output goes into all.txt. Then 2>&1 copies entry 1 of the file descriptor table into entry 2. Entry 2 originally pointed to the display; after the copy it points to all.txt as well, so standard output and standard error are both written to the same file.
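
The two steps can be sketched with open and dup2. This is an illustration of the idea, not how any particular shell is implemented; redirect_both is a hypothetical helper, and it takes the descriptor numbers as parameters (a shell would use 1 and 2) so that it can be tried without disturbing the real stdout and stderr.

```c
#include <fcntl.h>
#include <unistd.h>

/* Makes out_fd refer to `path`, then makes err_fd refer to whatever
 * out_fd refers to. With out_fd = 1 and err_fd = 2 this mirrors
 * "> path 2>&1". Hypothetical helper; returns 0 on success, -1 on error. */
int redirect_both(int out_fd, int err_fd, const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) return -1;

    /* "> all.txt": entry out_fd now points at the file */
    if (dup2(fd, out_fd) < 0) {
        close(fd);
        return -1;
    }
    if (fd != out_fd) close(fd);

    /* "2>&1": copy entry out_fd into entry err_fd */
    if (dup2(out_fd, err_fd) < 0) return -1;

    return 0;
}
```

Because dup2 copies the table entry, both descriptors share one open file and one file offset, which is why the two streams do not overwrite each other in all.txt.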

5.2. perror

C has a global variable, errno, that records why the most recent system call or C library function call failed.

Imitating perror

#include <iostream>
#include <cstdio>
#include <cstring>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>

void my_perror(const char *info)
{
    fprintf(stderr, "%s: %s\n", info, strerror(errno));
}

int main()
{
    //open is a system call; on failure it sets errno (so does the library function fopen)
    int fd = open("log.txt", O_RDONLY); //fails if log.txt does not exist
    if(fd < 0)
    {
        //perror("open");
        my_perror("my open");
        return 1;
    }

   
}

How perror works: when a system call or library function fails, it usually sets the global variable errno to a specific error code. perror reads the value of errno and looks up the corresponding error description. It then writes the string passed as its argument, followed by a colon and that description, to the standard error stream.

2. File system

1. Physical structure of disk


A mechanical hard disk is essentially the only mechanical component used for storage in a computer. It is a peripheral device and is slower than other storage devices, but its low price and large capacity make it the first choice for enterprise storage. The gap between the head and the platter surface is extremely small, so the drive is sealed to keep dust out. The disk must not be moved or shaken while it is running, because the head could scratch the platter surface and cause data loss. To write binary data, the head magnetizes small regions of the platter, flipping their north/south polarity.

2. Storage structure of disk

When addressing a disk, the basic unit is the sector (512 bytes). As shown in the figure, the blue part is a sector. Sectors closer to the center of the platter have a smaller area and sectors farther from the center have a larger area, but every sector stores 512 bytes.

How a sector is located on one surface: the head swings to select the track, and the platter spins at high speed to bring the sector under the head. (Disk manufacturers match the rotation speed to the seek speed of the head, so the faster the disk rotates, the higher its IO efficiency.)

All the heads of a disk move in and out together. To locate a sector on the whole disk, the hardware uses CHS positioning: 1. locate the cylinder; 2. locate the head (disk surface); 3. locate the sector.

3. Logical abstract structure of disk

3.1. CHS addressing method

In the past, hard disk capacity was relatively small, and hard disks were designed after the structure of floppy disks: every track on a platter had the same number of sectors. This gave rise to the 3D CHS parameters (disk geometry), that is, the number of heads (Heads), the number of cylinders (Cylinders) and the number of sectors (Sectors), and the corresponding CHS addressing mode.

CHS addressing mode describes a location on the hard disk with three parts: Head, Cylinder, and Sector.

  • Head: There is a head on the front and on the back of each platter, so one head corresponds to one platter surface. The head number therefore tells which surface the data is on.
  • Cylinder: The tracks with the same radius on all platter surfaces, stacked on top of one another, form the shape of a cylinder. So the number of cylinders equals the number of tracks on one surface.
  • Sector: Each track is divided into several small segments. Each segment is shaped like a fan, so it is called a sector. The capacity of each sector is 512 bytes.

The maximum capacity of CHS addressing is determined by three parameters of CHS:

  • The maximum number of heads is 255, stored in 8 binary bits, and numbered starting from 0
  • The maximum number of cylinders is 1023, stored in 10 binary bits, and numbered starting from 0
  • The maximum number of sectors is 63, stored with 6 binary bits, numbered starting from 1

Therefore, the maximum addressing range of CHS addressing mode is as follows:

\frac{255\times1023\times63\times512}{1048576}\approx 8024.7\,\textrm{MB}\approx 7.837\,\textrm{GB}\,\,(1\,\textrm{MB}=1048576\textrm{\,Bytes},\ 1\,\textrm{GB}=1024\,\textrm{MB})
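
The same arithmetic can be checked in code. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Capacity in bytes addressable with a given CHS geometry.
 * chs_capacity_bytes is a hypothetical helper for this illustration. */
uint64_t chs_capacity_bytes(uint64_t heads, uint64_t cylinders,
                            uint64_t sectors_per_track, uint64_t sector_size)
{
    return heads * cylinders * sectors_per_track * sector_size;
}
```

With 255 heads, 1023 cylinders, 63 sectors per track and 512-byte sectors the result is 8414461440 bytes. Dividing by 1048576 gives about 8024.7 MB, and dividing again by 1024 gives about 7.837 GB.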

3.2. Logical sectors (LBA)

Under an older, tighter limit, the three-dimensional CHS address could go no higher than 1024/16/63.

With that limit the capacity can only reach 1024×16×63×512 Byte = 528482304 Byte = 504M.

In addition, keeping track of cumbersome CHS triples when the system manages files is laborious and inefficient.

However, after using logical sectors (LBA), you can get rid of the restrictions of hardware parameters such as cylinder and head during disk read and write operations.

Logical sectors are set up to facilitate the operating system to read and write hard disk data. Its size and specific address can be matched with the physical address through a certain formula. The operating system can read and write data based on the LBA without caring about how the physical addresses are mapped.

In LBA mode, the operating system can treat all physical sectors as linearly numbered sectors according to a certain method or rule, arranged from 0 to a certain maximum value, and connected into a line. Looking at the LBA as a whole, rather than specific to the actual CHS value, only an ordinal number is needed to determine the unique physical sector, which is the origin of linear addressing. Obviously, the linear address is the logical address of the physical sector.
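
The "certain formula" is commonly written as LBA = (C × heads + H) × sectors_per_track + (S − 1), where the −1 appears because sectors are numbered from 1. The sketch below assumes that conventional formula; the type and function names are hypothetical.

```c
#include <stdint.h>

/* A CHS address: cylinder and head count from 0, sector counts from 1. */
typedef struct {
    uint32_t c;
    uint32_t h;
    uint32_t s;
} chs_t;

/* Converts a CHS address to a linear sector number (LBA).
 * `heads` and `spt` (sectors per track) describe the disk geometry. */
uint32_t chs_to_lba(chs_t a, uint32_t heads, uint32_t spt)
{
    return (a.c * heads + a.h) * spt + (a.s - 1);
}

/* Converts a linear sector number back to a CHS address. */
chs_t lba_to_chs(uint32_t lba, uint32_t heads, uint32_t spt)
{
    chs_t a;
    a.c = lba / (heads * spt);
    a.h = (lba / spt) % heads;
    a.s = lba % spt + 1;
    return a;
}
```

For a geometry of 16 heads and 63 sectors per track, C/H/S = 0/0/1 is LBA 0, 0/1/1 is LBA 63, and 1/0/1 is LBA 1008. The operating system only deals with the linear number and leaves the translation to the lower layers.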

3.3. Basic unit of disk

The basic unit of a disk is the sector (normally 512 bytes).
The basic unit in which the file system accesses the disk is 4KB, that is, 8 consecutive sectors. There are two reasons for using a larger unit:

① It improves IO efficiency.
② It keeps the design of the software (OS) from being tightly tied to the hardware (disk), that is, decoupling.

Why IO efficiency improves: copying data from the disk into memory does not take much time. What costs more is finding the location on the disk, that is, the addressing process. No matter how fast the platter spins, mechanical movement can never match the speed of electrical signals, so it pays to read more data per seek.
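
The relationship between 4KB blocks and 512-byte sectors is simple arithmetic. A minimal sketch, assuming block 0 starts at LBA 0 (a real partition would add its starting offset); the names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE       512u
#define BLOCK_SIZE        4096u
#define SECTORS_PER_BLOCK (BLOCK_SIZE / SECTOR_SIZE) /* 8 */

/* First sector (LBA) of a given file system block. */
uint64_t block_first_lba(uint64_t block_no)
{
    return block_no * SECTORS_PER_BLOCK;
}

/* File system block that contains a given sector. */
uint64_t lba_to_block(uint64_t lba)
{
    return lba / SECTORS_PER_BLOCK;
}
```

Block 3, for example, covers sectors 24 to 31. The file system only ever speaks in block numbers, which is what keeps it independent of the sector size of the disk underneath.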

3.4. What is decoupling?

Decoupling refers to reducing the close correlation between two or more systems, components or modules so that they can evolve independently without affecting each other. In software development, decoupling is a design principle that aims to reduce the dependencies between components of a system, thereby improving the maintainability, flexibility, and scalability of the system.

The purpose of decoupling is to reduce the degree of interdependence between components in the system so that they can be modified, updated, or replaced independently without affecting other components. When components in a system are highly coupled, modifications to one component may trigger a chain reaction on other components, making the system unstable and difficult to maintain. The decoupling design can make each component more independent, reduce the complexity of the system, and improve the flexibility and scalability of the system.

How to manage files then becomes a question of how to manage the data in one group (the block groups described below).

4. inode

4.1. Basic understanding of inode

To explain the inode clearly, let us first take a brief look at the file system.

The picture above shows the on-disk layout of the Linux ext2 file system (the in-memory image in the kernel is certainly different). A disk is a typical block device, and a hard disk partition is divided into blocks. The size of a block is determined at format time and cannot be changed afterwards. For example, the -b option of mke2fs can set the block size to 1024, 2048 or 4096 bytes. The size of the Boot Block in the picture is fixed.

  • Block Group: The ext2 file system divides a partition into several Block Groups according to its size. Every Block Group has the same structure. This is similar to a government managing a country district by district.
  • Super Block: Stores the structural information of the file system itself. It mainly records the total number of blocks and inodes, the number of unused blocks and inodes, the size of a block and of an inode, the most recent mount time, the most recent write time, the most recent disk check time, and other file system information. If the Super Block is destroyed, the entire file system structure can be considered destroyed.
  • Group Descriptor Table (GDT): block group descriptors, which describe the attributes of each block group.
  • Block Bitmap: records which data blocks in the Data Blocks area are occupied and which are free.
  • inode Bitmap: each bit indicates whether an inode is free and available.
  • inode Table: stores inodes in units of 128 bytes. Each inode holds file attributes such as file size, owner and last modification time, and is identified by a unique inode number. Generally speaking, one file has one inode number.
  • Data area (Data blocks): stores file content in blocks.
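
The Block Bitmap and inode Bitmap are plain arrays of bits. The sketch below shows the usual test/set/clear operations and a first-free allocation; it is an in-memory illustration only, with a deliberately small, hypothetical group size and hypothetical names, and does not reproduce the ext2 on-disk format.

```c
#include <stdint.h>

#define BLOCKS_PER_GROUP 64 /* hypothetical, kept small for illustration */

typedef struct {
    uint8_t bits[BLOCKS_PER_GROUP / 8]; /* one bit per block: 1 = in use */
} bitmap_t;

/* Returns 1 if entry i is in use, 0 if it is free. */
int bitmap_test(const bitmap_t *bm, unsigned i)
{
    return (bm->bits[i / 8] >> (i % 8)) & 1;
}

/* Marks entry i as in use. */
void bitmap_set(bitmap_t *bm, unsigned i)
{
    bm->bits[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Marks entry i as free. */
void bitmap_clear(bitmap_t *bm, unsigned i)
{
    bm->bits[i / 8] &= (uint8_t)~(1u << (i % 8));
}

/* Finds the first free entry, marks it in use and returns its index,
 * or -1 if every entry is taken. */
int bitmap_alloc(bitmap_t *bm)
{
    for (unsigned i = 0; i < BLOCKS_PER_GROUP; i++) {
        if (!bitmap_test(bm, i)) {
            bitmap_set(bm, i);
            return (int)i;
        }
    }
    return -1;
}
```

Allocating a block or an inode is a matter of flipping one bit from 0 to 1, and freeing it flips the bit back, without touching the data itself.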

An inode stores the attributes of a file and has a fixed size.

4.2. How does an inode (file, attribute) associate with its own content?

A data block (4KB) normally holds file content, but it can also hold the numbers of other blocks!

//struct inode includes all attributes of the file:
struct inode {
    //All attributes of the file
    unsigned int block[15]; //block numbers used to find the file content
};

In block[15], entries [0]-[11] directly store the numbers of the data blocks that hold the file content.
Entries [12]-[14] are indirect: each points to a data block that does not hold file data itself, but holds the numbers of other blocks used by the file!
In this way, one inode can be associated with many blocks.
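
To make the indexing concrete, the sketch below assumes the ext2 layout: 12 direct entries, then one single-indirect, one double-indirect and one triple-indirect entry. Given the position of a block within a file, it reports which entry of block[15] leads to it. The function name is hypothetical, and ptrs_per_block is the number of block numbers that fit in one block (1024 for 4KB blocks with 4-byte block numbers).

```c
#include <stdint.h>

#define DIRECT_SLOTS 12

/* Returns which entry of block[15] (0..14) is used to reach the n-th
 * block of a file (n counts from 0), or -1 if the file would be too
 * large. Hypothetical helper, assuming the ext2 layout. */
int inode_slot_for_block(uint64_t n, uint64_t ptrs_per_block)
{
    uint64_t single = ptrs_per_block;
    uint64_t dbl = ptrs_per_block * ptrs_per_block;
    uint64_t triple = dbl * ptrs_per_block;

    if (n < DIRECT_SLOTS) return (int)n; /* direct: block[0..11] */
    n -= DIRECT_SLOTS;

    if (n < single) return 12;           /* single indirect */
    n -= single;

    if (n < dbl) return 13;              /* double indirect */
    n -= dbl;

    if (n < triple) return 14;           /* triple indirect */
    return -1;
}
```

With 4KB blocks, the direct entries cover the first 48KB of a file, the single-indirect entry adds 1024 more blocks (4MB), and the deeper levels multiply that by 1024 each time.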

The file name counts as an attribute of the file, but the file name is not stored in the inode.
Under Linux, the bottom layer actually identifies files through inode numbers.

To find the file, you must find the inode number of the file.

4.3. How does the operating system find inodes?

A directory is also a file. File = content + attributes, so a directory has its own attributes and its own inode.
Creating a file in a directory requires the w permission on that directory; viewing the file names in it requires the r permission.
A directory also has its own data blocks, and those data blocks hold the mapping between file names and inode numbers. In other words, the content of a directory is a list of file names and their inode numbers.

File name : inode number is a mapping. Both the file name and the inode number are data, and they are ultimately stored in the content of the directory.

Linux does not allow multiple files with the same name in the same directory. The file name therefore acts as a key: within one directory, each name maps to exactly one inode number.
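
The content of a directory can be pictured as a small table keyed by file name. The sketch below is an in-memory illustration with hypothetical types, sizes and function names, not the ext2 on-disk directory format. It relies on the fact that inode numbers start at 1, so 0 can signal "not found".

```c
#include <string.h>

#define MAX_ENTRIES 16 /* hypothetical limits for this illustration */
#define NAME_LEN    32

typedef struct {
    char name[NAME_LEN];
    unsigned inode;
} dir_entry_t;

typedef struct {
    dir_entry_t entries[MAX_ENTRIES];
    int count;
} dir_t;

/* Looks a file name up in the directory content.
 * Returns its inode number, or 0 if the name is not present. */
unsigned dir_lookup(const dir_t *d, const char *name)
{
    for (int i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].name, name) == 0)
            return d->entries[i].inode;
    }
    return 0;
}

/* Adds a (file name, inode number) mapping to the directory content.
 * Returns 0 on success, or -1 if the name already exists, is too long,
 * or the directory is full. */
int dir_add(dir_t *d, const char *name, unsigned inode)
{
    if (strlen(name) >= NAME_LEN || d->count >= MAX_ENTRIES) return -1;
    if (dir_lookup(d, name) != 0) return -1; /* names are unique keys */

    strcpy(d->entries[d->count].name, name);
    d->entries[d->count].inode = inode;
    d->count++;
    return 0;
}
```

Finding a file means calling something like dir_lookup on its directory, and creating a file ends with something like dir_add, which refuses a second entry with the same name.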

4.4. What does the operating system do when creating a file?

A file is always created inside some directory. The steps are: obtain the file name and the inode number allocated to the new file -> find the data block of the directory you are in -> write the mapping between the file name and the inode number into that data block.

This also explains why the file name is not saved in the inode, because the file name is in the directory.

4.5. If you know the name of the directory you are in, do you know its inode?

To find the inode of a directory, we have to look up its name in the parent directory, which holds the mapping. Therefore, knowing only the name of the current directory is not enough to know its inode.

4.6. When a file is deleted, what does the operating system do?

Find the data block of the directory, which holds the file name : inode number mappings. File names are unique within a directory, so the search is done by file name. Once the entry is found, read the inode number from it, and use the inode number to find the Block Group the inode belongs to. The operating system then changes the bit for that inode in the inode Bitmap from 1 to 0, and changes the bits for the file's data blocks in the Block Bitmap from 1 to 0. At this point the deletion is complete.
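
The step "find the Block Group from the inode number" is a division. In ext2, inode numbers start at 1 and every group holds the same number of inodes, so the group and the position inside the group follow directly. A minimal sketch with hypothetical function names; inodes_per_group would come from the Super Block.

```c
#include <stdint.h>

/* Block group that holds the given inode (inode numbers start at 1). */
uint32_t inode_group(uint32_t ino, uint32_t inodes_per_group)
{
    return (ino - 1) / inodes_per_group;
}

/* Position of the inode inside its group. This is both its bit index
 * in the group's inode Bitmap and its slot in the inode Table. */
uint32_t inode_index_in_group(uint32_t ino, uint32_t inodes_per_group)
{
    return (ino - 1) % inodes_per_group;
}
```

With 8192 inodes per group, inode 8192 is the last inode of group 0 and inode 8193 is the first inode of group 1.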

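Because deletion only clears bitmap bits, a deleted file can sometimes be restored. To restore it, we need the inode number of the deleted file; with the right tools, the corresponding bitmap bits can be set from 0 back to 1, provided the inode and data blocks have not been reused yet.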

When we delete a file, the operating system removes the file's directory entry and decrements the link count in the file's inode by one. If the link count drops to 0, the file's data blocks are freed and the inode is marked as available. However, deleting a file does not mean its data is released immediately, because the file may still be open in other processes. The blocks are only released once the link count is 0 and no process has the file open. In addition, special tools can overwrite the file's data to make sure the content is unrecoverable. This kind of overwriting is called secure deletion.

The biggest difficulty in recovery: the files have been deleted, how do you know what the inode is? In order to support recovery, the Linux system will save the inode number in the system log file. Recovery is a bit difficult!

In fact, Windows behaves the same way. Most file systems do not actually erase a file's data when the file is deleted.