File buffers, the storage principle of disks, and soft and hard links

Contents

Buffer

Buffer Execution Concept

The C language buffer exists in the FILE structure

Policy for flushing user buffers to OS buffers

A redirect occurs

redir without redirection

redir with redirection

Disk storage principle

Details:

Inode table

Data block

Inode bitmap

Block bitmap

Group Descriptor Table

super block

Run the cat program in the directory

Soft and hard links

Understanding soft and hard links with a diagram


Relevant knowledge points:


Buffer

Buffer Execution Concept

When a file is written, there are buffers at both the user layer and the system layer. The user layer first puts the data to be written into the user buffer. Then, through the system call interface, the data in the user buffer is copied into the system (kernel) buffer of the file selected by fd, and the system layer later writes it from that buffer to the disk file.

C language buffer exists in FILE structure

So printf, fprintf, fputs and similar functions all write to the user buffer.

Strategy for flushing user buffer to OS buffer

User -> OS flush policies

1. Flush immediately: call fflush(FILE*)

2. Line flush (line buffering): flushed whenever \n is encountered; used when printing to the display

3. Flush when the buffer is full (full buffering): used when writing data to a disk file. Whatever is still buffered is flushed in one go when the stream is closed or the process ends normally (fclose can be thought of as fflush followed by close()).

Redirect occurred

Redirection occurs when a string that would have been printed on the display is written to a file instead.

Display -> File

line buffering -> full buffering

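#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main()
{
   //close(1);
   // write() is a system call: the data goes straight to the kernel buffer
   const char *msg1 = "==standard output==\n";
   write(1, msg1, strlen(msg1));

   const char *msg2 = "==standard error==\n";
   write(2, msg2, strlen(msg2));

   // printf and fprintf go through the user buffer inside FILE
   printf("hello world\n");
   fprintf(stdout, "wdsj\n");
   close(1); // close fd 1 before the user buffer has been flushed
   return 0;
}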

Run the code.

Now let's redirect the output to log.txt.

What printf and fprintf wrote is lost: it appears neither on the display nor in the file. What is going on?

Because the output is redirected to a file, the flush policy of stdout changes from line buffering to full buffering.

The data printed by printf and fprintf is therefore not printed right away; it stays in the user buffer.

Then we close fd 1 early, which now refers to log.txt (bash's redirection makes stdout write to log.txt).

As a result, when the process ends and tries to flush the user buffer, the data can no longer be passed through fd 1 to the kernel file buffer, so it never reaches the file.

Note that each FILE object has its own buffer, and the buffers are not shared.

But why does the output of write reach the file? Wasn't the fd closed? Because write is a system call, it puts the data directly into the kernel file buffer without going through the user buffer. The data was already in the kernel before we closed fd 1, and closing the fd does not affect data in the kernel buffer.

When the process ends normally (main returns or exit() is called), the user buffers are flushed to the kernel. Closing the fd beforehand prevents the user-buffer data from being flushed to the kernel buffer, but it does not affect the kernel writing its own buffered data to disk.

redir without redirection

When printing to the display, stdout is line buffered. Each time \n is encountered, the data is flushed from the user buffer to the kernel file buffer and printed on the display. So by the time fd 1 is closed, everything has already been flushed, and the output is not affected.

If you close fd 1 early:

The struct file of the display can no longer be reached through index 1 of the file descriptor table.

But this does not affect fd 2, which still refers to the display's struct file!

redir with redirection

Strings that have reached the kernel buffer will be written to the disk file by the kernel, whether they got there through a flush of the user buffer or directly through write.

This is my own experiment; if there is a mistake, please leave a comment.

Full buffering: the data written by C library functions such as printf and fprintf is kept in the user buffer. Whether or not it contains '\n', it stays there until it is flushed explicitly, the buffer fills up, or the program ends normally and flushes the buffer into the kernel file buffer.

The write system call, however, does not go through the user buffer. It places the data directly into the corresponding kernel file buffer, from which it is written to the disk.

Run the code.

Finally, closing fd 1 does not affect the write system call: write had already put its data into the file before the close, whereas the C library functions wrote nothing anywhere. This verifies that the C library has its own buffer. After redirection, "\n" no longer triggers a flush to the kernel file buffer, and once fd 1 has been closed, the process cannot flush the data to the file through fd 1 when it ends.

Disk storage principle

We imagine the disk as one very large array.

Divide the array into partitions, and zoom in on partition 3 to view the details.

boot block: marks the start of a partition. Partition 3 runs from its boot block up to the next boot block. If we can manage partition 3 well, the other partitions can be managed in the same way.

Partition 3 is further divided into n block groups; zoom in on block group 0.

super block: holds the information about all the blocks in the partition. Every block group in the same partition (block group 0 to n) stores the same super block.

Group Descriptor Table: stores the usage information of the current block group.

Block bitmap: records which data blocks in the current block group are used and which are free, one bit per block: 0 means unused, 1 means used.

Inode bitmap: records which entries of the inode table are used and which are free, with the same binary marking as the block bitmap.

Inode table: an array of structures, one per file. Each structure stores the file's attributes, such as permissions, size and date, and also the addresses of the file's data blocks.

Data block: stores the content of a file, such as the code written in a C source file or the data of a picture.

Details:

Inode table

It is similar to a large array (it may be a linked list, which does not affect the conceptual understanding) of structures. Picture a desk covered with boxes (the structures), each of which stores the attributes of one file. Each box has its own unique inode number; unique here means unique across the inode tables of the whole partition.

Data block

It is also similar to a large array (or perhaps a linked list, which does not affect the conceptual understanding), again like a desk with many boxes, each of which stores file content. The right box is found through the data block addresses saved in the inode. The inode structure may have an array member that records several data blocks, or it may hold a pointer to one data block that is chained to the others. We will not dig into the details here.

Inode bitmap

It works like a roster that shows which entries of the inode table are used and which are free, using binary 0s and 1s.

0 means the entry is not used, 1 means it is used.

Block bitmap

The same kind of roster as the inode bitmap, but for data blocks.

Group Descriptor Table

Stores the usage of the current block group, such as how many of its data blocks and inode table entries are in use.

Super block

Stores the basic data of the file system, including the usage of all block groups in the current partition. Each block group has a copy. This is redundant, but if the copy in one block group is damaged, the information can be recovered from the copy in another block group of the same partition.

inode: every file has an inode number, much like a person's ID card number.

Every file has an inode number that is unique within its file system, and it does not change.

Run the cat program in the directory

First of all, every file or directory we create is created inside some directory. The data stored in a directory is actually the mapping between the names of the files in it and their inode numbers.

Keep this concept in mind.

What happens when this command runs? (We will not go into how cat itself works.) Start from the inode number of the current directory (.) -> the inode structure of the directory -> the data blocks of the directory -> in those data blocks, find the inode number mapped to the file name proc.cc -> the inode structure of proc.cc -> the data blocks that hold its content -> cat prints them to the display file.

Soft and hard links

With the basic logic above, let's look at what soft and hard links are.

Let's create a soft link to the executable.

When we execute rlink, the program runs as well.

Let's create a hard link to the executable.

headlink can also run the program, so what is the difference?

Execute the command ls -i -l

You will find that headlink and test have the same inode, while the inode of rlink is different from that of test.

As said earlier, an inode number identifies a file. So creating a soft link actually creates a new file, while a hard link is another name for the same file, similar to a reference at the language level.

When we delete the executable

The hard link can still execute the program, but the soft link no longer can. Why is this? Let's draw a diagram to understand it.

The hard link's file name, which may be in another directory, uses the same inode as test, while the soft link and the executable have different inodes.

So a soft link creates a new file in the directory and stores the path of the executable in it (a relative path in this example), while a hard link adds another file name to the directory that maps to the same inode as the executable, so both names access the same file.

When we delete test

What is actually deleted is the name test and its mapping to the inode in the directory. headlink still refers to the inode, so the bit in the inode bitmap is not set to 0, and neither are the corresponding bits in the block bitmap; that is why the hard link can still execute the program. The soft link, however, runs the program through the executable's file name; once that name is deleted, the soft link has nothing left to point to!

Therefore: a soft link file only stores the path of the file it links to; it does not record the target file's inode.

A hard link is another file name for the same inode: the names differ, but the inode is the same.

Soft link:

  • 1. Soft links exist in the form of paths. Similar to shortcuts in the Windows operating system
  • 2. Soft links can cross file systems, but hard links cannot
  • 3. A soft link can link to a file name that does not exist
  • 4. Soft links can link to directories

Hard link:

  • 1. A hard link looks like a copy of the file, but takes up no additional space for the file's data.
  • 2. It is not allowed to create hard links to directories
  • 3. Hard links can only be created in the same file system
