Unreadable to the naked eye: the unique romance of binary. A blog post on file operations (C language)

Table of Contents

1. Why use files

2. What is a file?

2.1 Program files

2.2 Data files

2.3 Text files and binary files

2.4 File name

3. Opening and closing files

3.1 File pointer

3.2 Opening and closing files

3.3 Sequential reading and writing functions of files

3.3.1 The concept of flow

3.3.2 Concept of input and output

3.3.3 Function operations

3.4 Random reading and writing functions of files

3.4.1 fseek

3.4.2 ftell

3.4.3 rewind

4. Determination of the end of file reading

4.1 feof

4.2 Text files

4.3 Binary files

5. File buffer


1. Why use files

Why use files? I asked myself this when I first started learning file operations. As long as the program runs, what does it matter whether anything is saved to a file? So from the very beginning I took a perfunctory attitude: “I’ll just listen for form’s sake” and “I’ll learn it properly when I actually need it”…

However, I suddenly changed my mind yesterday. How could the address book management system I wrote earlier really keep a few important contacts? If I lost touch with them one day, could I actually find them again with my own program? Once I tried to put it into practice, I saw that files really do matter.

When we write to a file in binary form, the contents cannot be read with the naked eye if the file is opened in the usual way. They only make sense to a program written specifically to read that binary data. So I think:
Being incomprehensible to the naked eye is a romance unique to binary.

Back to the point: when we actually save information to the computer, the data should persist until we choose to delete it.
This is the issue of data persistence. Common persistence methods include storing data in disk files and storing data in a database. Here we learn to use files to store data directly on the computer’s hard drive, which achieves data persistence.

2. What is a file

Files on disk are files.
But in programming, we generally talk about two types of files: program files and data files (from the perspective of file function).

2.1 Program File

These include source files (suffix .c), object files (suffix .obj on Windows), and executable programs (suffix .exe on Windows).

2.2 Data File

The content of a data file is not the program itself but the data that the program reads or writes while it runs, such as a file the program reads its input from or a file it writes its output to.

What we study here is mainly data files.

2.3 Text Files and Binary Files

Depending on how the data is organized, data files are called text files or binary files.

Data is stored in memory in binary form. If it is written to external storage without conversion, the result is a binary file.
If it must be stored on external storage as ASCII codes, it has to be converted before it is written. A file stored as ASCII characters is a text file.

How is a given piece of data stored?
Characters are always stored in ASCII form, while numeric data can be stored in either ASCII or binary form.

For example, if the integer 10000 is written to disk as ASCII text, it occupies 5 bytes (one byte per character), whereas written in binary form it occupies only 4 bytes (tested with VS2013).

2.4 File Name

A file must have a unique file identifier so that users can identify and reference it.

The file name contains 3 parts: file path + file name stem + file suffix

For example: c:\code\test.txt

For convenience, the file identifier is often referred to as the file name

3. Opening and closing files

3.1 File Pointer

In a buffered file system, the key concept is the “file type pointer”, or “file pointer” for short.
Each file in use is given a corresponding file information area in memory that stores information about the file (such as its name, its status, and the current position in the file).

This information is stored in a structure variable whose type is declared by the system and named FILE. Whenever a file is opened, the system automatically creates a variable of the FILE structure for it and fills in the information, so we do not have to care about the details.

In general, this FILE structure variable is accessed through a FILE pointer, which is more convenient to use.

FILE* pf; //File pointer variable

Here pf is a pointer variable that points to data of type FILE (it is named pf to distinguish it from an ordinary pointer p). pf can be made to point to the file information area of a particular file (which is a structure variable), and the file can be accessed through the information stored there. In other words, the file associated with a file pointer variable can be found through that variable.

3.2 File opening and closing

A file should be opened before it is read or written and closed when we are finished with it (just as when we use files on the computer ourselves).
In a program, opening a file returns a FILE* pointer that points to the file, which establishes the relationship between the pointer and the file. In the code that follows, we use pf to operate on the file.

//open a file
FILE * fopen ( const char * filename, const char * mode );
//Close file
int fclose (FILE * stream);

/* use case */
#include <stdio.h>

int main()
{
    // Open the file for reading
    FILE* pf = fopen("test.txt", "r");

    // fopen returns NULL if the file could not be opened
    if (pf == NULL)
    {
        perror("fopen");
        return 1;
    }
    // File-related operations
    // ......

    // Close the file
    fclose(pf);
    pf = NULL;
    return 0;
}

3.3 File Sequential Reading and Writing Function

3.3.1 The concept of flow

First, let’s understand the concept of a “stream”. A “stream” is a very abstract concept: data flows through various “streams” to its destination, such as the screen, the hard disk, or the network…

3.3.2 The concept of input and output

Second, to avoid confusion about input and output when we learn the functions, let’s first pin down the input-output relationship between a program and a file. Both terms are named from the program’s point of view: reading from a file into the program is input, and writing from the program to a file is output.

With the standard streams, we interact with the program through the keyboard. With file streams, we use the keyboard to make the program interact with a file, which easily causes confusion. For example, we may think that typing on the keyboard into the C program makes this an input stream, when in fact the program is writing to a file, which is an output stream…

3.3.3 Function Operation

Now, let’s formally learn the functions that can operate files:

Function                      Function name   Applicable to
Character input function      fgetc           All input streams
Character output function     fputc           All output streams
Text line input function      fgets           All input streams
Text line output function     fputs           All output streams
Format input function         fscanf          All input streams
Format output function        fprintf         All output streams
Binary input                  fread           File
Binary output                 fwrite          File

Streams are divided into file streams and the standard input and output streams. The first six functions work with both files and the standard streams (keyboard and screen), while the last two are meant for files.

When we want to use the first six functions on a standard stream, we simply pass stdin or stdout in place of the FILE *stream argument.

Let’s learn about the romance of binary: fwrite and fread functions!

The array that ptr points to is one we create ourselves. When writing, we put the content to be written to the file in the array. When reading, we store the contents of the file in the array and then print the array. Here is the address book I wrote as an example:

The array we use to receive the data (the array pointed to by ptr) must have the same format as the data in the file, or the binary file cannot be read correctly. In other words, if you have no idea what is stored in a binary file, it is difficult to read it with fread.

3.4 Random reading and writing functions of files

To work with files, the functions above are not enough on their own. Let’s introduce several more functions that let us read and write at arbitrary positions in a file.

3.4.1 fseek

Positions the file pointer according to a reference position (origin) and an offset.

origin:

Name        Reference position
SEEK_SET    Start of the file
SEEK_CUR    Current position in the file (where the cursor is)
SEEK_END    End of the file

We can use fseek to specify the position of the cursor. The next read or write then starts at that position.

With the fgetc function used earlier, we read one character at a time, and after each read the cursor sits just before the next character. That is the position SEEK_CUR refers to.

Let’s look at an example below:

3.4.2 ftell

Returns the offset of the file pointer relative to the start of the file.

This is the same position that SEEK_CUR refers to.

3.4.3 rewind

Returns the file pointer to the beginning of the file.

4. Determination of the end of file reading

4.1 feof

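feof is used after reading has ended, to determine whether it ended because a read failed or because the end of the file was reached.

The return value of feof must not be used directly as the test for whether the file has ended! The end-of-file indicator is only set after a read has already tried to go past the end.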

4.2 Text File

For a text file, check whether the return value is EOF (fgetc) or NULL (fgets).

To read the whole file, we can therefore use a while loop, much like the earlier loops that handled multiple sets of input.

4.3 Binary File

For a binary file, check whether the return value of fread is smaller than the number of items requested.

This is why I used a for loop to traverse the entire file earlier: I needed an exact count to traverse the binary file completely.

5. File Buffer

The ANSI C standard uses a “buffered file system” to process data files. A buffered file system means that the system automatically creates a “file buffer” in memory for each file the program is using.

Data output from memory to disk is first sent to the buffer in memory, and only when the buffer is full is it sent to the disk all at once.

When data is read from disk into the program, it is first read from the disk file into the memory buffer (filling the buffer), and then sent from the buffer to the program’s data area (program variables, etc.) piece by piece. The size of the buffer is determined by the C compilation system.

Code test:

#include <stdio.h>
#include <windows.h> // Sleep() is Windows-specific
int main()
{
    FILE* pf = fopen("test.txt", "w");
    if (pf == NULL)
    {
        perror("fopen");
        return 1;
    }
    fputs("Yeyeshiningjingjing", pf); // The data goes into the output buffer first
    printf("Sleep for 10 seconds - data has been written. Open the test.txt file and find that the file has no content\n");
    Sleep(10000);
    printf("Flush the buffer\n");
    fflush(pf); // Flushing writes the data in the output buffer to the file (disk)
    printf("Sleep for another 10 seconds - at this time, open the test.txt file again, the file has content\n");
    Sleep(10000);
    fclose(pf);
    // Note: fclose also flushes the buffer when closing the file.
    pf = NULL;
    return 0;
}
