[C] file operation (fopen and other functions)

File operation

    • 1. Why use files
    • 2. What is a file
      • 2.1 Program files
      • 2.2 Data files
      • 2.3 File name
    • 3. Opening and closing of files
      • 3.1 File Pointer
      • 3.2 Opening and closing of files
    • 4. Sequential reading and writing of files
      • fputc
      • fgetc
      • fputs
      • fgets
      • fscanf
      • fprintf
      • fread
      • fwrite
    • 5. Random reading and writing of files
      • fseek
      • ftell
      • rewind
    • 6. Text files and binary files
    • 7. Judgment of the end of file reading
      • feof
    • 8. File buffer

1. Why use files

In the code we usually write in Visual Studio (VS), all of a program's data lives in memory while it runs. When the program exits, everything we typed in at the keyboard disappears, and the next time we run the program we have to type it all again, which is very troublesome when there is a lot of data.

So we want the data we entered to be kept, and to be deleted only when we choose to delete it.

This is the problem of data persistence. Common ways to persist data include storing it in disk files, storing it in a database, and so on.

Using files, we can store data directly on the hard disk of the computer, achieving data persistence.

2. What is a file

Files are what is stored on disk.
For example, the files on our C and D drives: picture files with the suffix .png, executables with the suffix .exe, and so on.

But in program design, we generally talk about two types of files: program files and data files (classified from the perspective of file functions).

2.1 Program files

Program files include source files (suffix .c), object files (suffix .obj on Windows), and executable programs (suffix .exe on Windows).

For example, suppose I am writing a bubble sort and create test.c, bubbleSort.c, and bubbleSort.h. These three files are all program files. After the build, an executable test.exe is created in the corresponding directory, and it is a program file too.

2.2 Data files

The content of a data file is not a program but the data that a program reads and writes while it runs: for example, a file the program reads its input from, or a file it writes its output to.

Following the above, I created a data.txt file in the corresponding directory, and then I can read (input) and write (output) data to data.txt through test.exe.

In the past, when we did not use files, all input and output went through the terminal: data was typed in at the keyboard and results were displayed on the monitor.

Sometimes, though, we want to write information to disk and read it back into memory when it is needed. What we are processing then is a file on disk.

2.3 File name

A file must have a unique file identifier so that users can identify and refer to it.
The file identifier consists of 3 parts: file path + file name trunk + file suffix
For example: c:\code\mycode\test.txt
Here c:\code\mycode\ is the file path, test is the trunk of the file name, and .txt is the file suffix.

On Windows, file names cannot contain these characters: / \ * ? " < > | :
The suffix of a file determines the default program used to open it.
The file path is the sequence of directory names from the drive letter down to the file.

For convenience, the file identifier is usually just called the file name.

3. File opening and closing

3.1 File pointer

In the buffered file system, the key concept is the “file type pointer”, or “file pointer” for short.

Each file in use has a corresponding file information area in memory, which stores information about the file (such as its name, its status, and the current position in the file). This information is held in a structure variable whose type is declared by the system and named FILE.

Below we can create a FILE* pointer variable:

FILE* pf;//File pointer variable

How does pf relate to the file information area?
The pf pointer points to the file information area of a file, and through the information stored there we can read and write that file (data.txt in our example).

The contents of the FILE type differ between C compilers, but they are broadly similar.
For example, the following is the FILE declaration provided by VS2013.

struct _iobuf
{
    char *_ptr;
    int   _cnt;
    char *_base;
    int   _flag;
    int   _file;
    int   _charbuf;
    int   _bufsize;
    char *_tmpfname;
};
typedef struct _iobuf FILE;

In fact, FILE is an ordinary structure, but we normally never touch its members directly.

Whenever a file is opened, the system will automatically create a variable of the FILE structure according to the situation of the file, and fill in the information, and the user does not need to care about the details. Generally, the variables of this FILE structure are maintained through a FILE pointer, which is more convenient to use.

Define pf as a pointer variable pointing to FILE type data. You can make pf point to the file information area of a certain file (it is a structure variable). The file can be accessed through the information in the file information area. That is to say, the file associated with it can be found through the file pointer variable.

3.2 Opening and closing of files

A file should be opened before it is read or written, and closed after use.

So the general procedure for working with a file is:

  1. Open the file
  2. Operate on the file (read/write)
  3. Close the file

When a program opens a file, a FILE* pointer to the file is returned, which establishes the relationship between the pointer and the file.

ANSI C specifies that the fopen function is used to open a file and the fclose function is used to close it.

fopen

FILE *fopen( const char *filename, const char *mode );

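The first parameter of the fopen function is the file name, and the second parameter is the opening mode. If the file cannot be opened, fopen returns NULL.

fclose

int fclose ( FILE * stream );

The parameter of the fclose function is the file pointer returned by fopen. fclose returns 0 on success and EOF on failure.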

How to open the file:

"r" (read-only) opens an existing text file for input
Error if the specified file does not exist

"w" (write-only) opens a text file for output; any existing content is discarded
Creates a new file if the specified file does not exist

"a" (append) adds data to the end of a text file
Creates a new file if the specified file does not exist

"rb" (read-only) opens a binary file for input
Error if the specified file does not exist

"wb" (write-only) opens a binary file for output; any existing content is discarded
Creates a new file if the specified file does not exist

"ab" (append) adds data to the end of a binary file
Creates a new file if the specified file does not exist

"r+" (read and write) opens a text file for reading and writing
Error if the specified file does not exist

"w+" (read and write) creates a new text file for reading and writing; any existing content is discarded
Creates a new file if the specified file does not exist

"a+" (read and write) opens a text file for reading, and for writing at the end of the file
Creates a new file if the specified file does not exist

"rb+" (read and write) opens a binary file for reading and writing
Error if the specified file does not exist

"wb+" (read and write) creates a new binary file for reading and writing; any existing content is discarded
Creates a new file if the specified file does not exist

"ab+" (read and write) opens a binary file for reading, and for writing at the end of the file
Creates a new file if the specified file does not exist

In fact, just remember the following:
Opening with "r" fails if the file does not exist, and opens the file if it does.
Opening with "w" or "a" creates the file automatically if it does not exist. If it does exist, "w" opens it and discards its contents, while "a" opens it and keeps them.

Let’s take an example:
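A minimal sketch of the open / operate / close pattern described above. The helper name open_and_close is hypothetical and only used for illustration; the file names in the usage note are assumptions too.

```c
#include <stdio.h>

/* Opens filename in the given mode and closes it again.
   Returns 0 on success and -1 on failure.
   (open_and_close is a hypothetical helper name used only in this sketch.) */
int open_and_close(const char* filename, const char* mode)
{
    FILE* pf = fopen(filename, mode);
    if (pf == NULL)
    {
        perror("fopen"); /* report why the file could not be opened */
        return -1;
    }

    /* ... read or write the file here ... */

    if (fclose(pf) == EOF)
    {
        perror("fclose");
        return -1;
    }
    return 0;
}
```

Calling open_and_close("test.txt", "w") from main creates test.txt if it does not exist yet, while open_and_close("missing.txt", "r") returns -1 when missing.txt does not exist.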

4. Sequential reading and writing of files

Sequential reading and writing means that input and output proceed through the contents of the file in order, from front to back.
The functions for sequential reading and writing are:

Function   Purpose                       Applies to
fgetc      character input function      all input streams
fputc      character output function     all output streams
fgets      text line input function      all input streams
fputs      text line output function     all output streams
fscanf     formatted input function      all input streams
fprintf    formatted output function     all output streams
fread      binary input                  files
fwrite     binary output                 files

Input and output streams were mentioned above, so let’s briefly talk about them here.
First of all, a stream is an abstract concept.
A “stream” is a flow: the movement of something from one place to another. Here it is an abstract description of ordered, continuous, directional data.
In a computer system, it refers to information being input from an external device into memory, or output from memory to an external device. This input and output process is vividly compared to a “stream”.

We read and write data through streams: for example, we can read and write files, the screen, the network, printers, or other external devices.

The stream we use to read and write a file is a file stream. We can write data to the file and read data from it, and “read” and “write” are always described from the perspective of memory.

There are also three standard streams: standard input, standard output, and standard error.
The standard input stream (stdin) is connected to the keyboard by default.
The standard output stream (stdout) is connected to the screen by default.
The standard error stream (stderr) is also connected to the screen by default.
These three streams are opened automatically when a C program starts. This is why we never have to open the keyboard or the screen ourselves.

How to use the above function?
Let’s look at them one by one:

fputc

int fputc ( int character, FILE * stream );

This function writes a character to the stream and advances the position indicator.
The position indicator is the current read/write position in the stream.

For example, in a file stream holding abcdef from which nothing has been read yet, the position indicator is at a.
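As a small sketch of fputc in use, the function below writes the letters a to z into a file one character at a time. The name write_alphabet is hypothetical, and the loop assumes an ASCII-compatible character set in which the lowercase letters are contiguous.

```c
#include <stdio.h>

/* Writes the letters 'a'..'z' to filename with fputc.
   Returns the number of characters written, or -1 on failure.
   (write_alphabet is a hypothetical name used only in this sketch.) */
int write_alphabet(const char* filename)
{
    FILE* pf = fopen(filename, "w");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }

    int written = 0;
    for (char ch = 'a'; ch <= 'z'; ch++)
    {
        /* each call writes one character and advances the position indicator */
        if (fputc(ch, pf) == EOF)
        {
            perror("fputc");
            fclose(pf);
            return -1;
        }
        written++;
    }

    if (fclose(pf) == EOF)
        return -1;
    return written;
}
```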

fgetc

int fgetc ( FILE * stream );

This function gets a character from the stream.
It returns the character currently pointed to by the stream’s internal file position indicator, which then advances to the next character.
If the stream is at end-of-file when called, the function returns EOF and sets the end-of-file indicator (feof) for the stream.
If a read error occurs, the function returns EOF and sets the error indicator (ferror) for the stream.

EOF is a negative int constant; its value is -1 in common implementations, including VS. This is why fgetc returns an int rather than a char.

Let’s see how to use it.
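A minimal sketch, assuming the file already exists: it reads the file with fgetc until EOF and echoes every character to the screen. The name print_file is hypothetical. The result of fgetc is kept in an int so that EOF can be told apart from every valid character.

```c
#include <stdio.h>

/* Prints the contents of filename to stdout, one character at a time.
   Returns the number of characters read, or -1 if the file cannot be opened.
   (print_file is a hypothetical name used only in this sketch.) */
long print_file(const char* filename)
{
    FILE* pf = fopen(filename, "r");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }

    long count = 0;
    int ch; /* int, not char: EOF must stay distinguishable */
    while ((ch = fgetc(pf)) != EOF)
    {
        putchar(ch);
        count++;
    }

    fclose(pf);
    return count;
}
```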

fputs

int fputs ( const char * str, FILE * stream );

This function writes a string to the stream.

It copies characters starting at the specified address (str) until it reaches the terminating null character ('\0'). The terminating null character is not copied to the stream, and no newline is added.
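The sketch below writes two lines with fputs. The name write_two_lines and the text being written are assumptions made for illustration. Because fputs adds no newline of its own, each "\n" is part of the string.

```c
#include <stdio.h>

/* Writes the lines "hello" and "world" to filename with fputs.
   Returns 0 on success and -1 on failure.
   (write_two_lines is a hypothetical name used only in this sketch.) */
int write_two_lines(const char* filename)
{
    FILE* pf = fopen(filename, "w");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }

    /* fputs returns EOF on failure and a non-negative value on success */
    if (fputs("hello\n", pf) == EOF || fputs("world\n", pf) == EOF)
    {
        perror("fputs");
        fclose(pf);
        return -1;
    }

    return fclose(pf) == EOF ? -1 : 0;
}
```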

fgets

char * fgets ( char * str, int num, FILE * stream );

This function reads characters from the stream and stores them as a C string in str until (num-1) characters have been read, a newline character is read, or end-of-file is reached, whichever occurs first.

The newline character stops fgets from reading, but it is considered a valid character by the function and is included in the string copied to str.

A terminating null character ('\0') is automatically appended after the characters copied to str. fgets returns str on success, and NULL if end-of-file is reached before any character is read or if a read error occurs.
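A short sketch of fgets reading the first line of a file into a caller-supplied buffer. The name read_first_line is hypothetical. With a buffer of size 4, at most 3 characters are stored, followed by '\0'.

```c
#include <stdio.h>

/* Reads the first line of filename into buf, storing at most size-1 characters.
   Returns 0 on success and -1 if the file cannot be opened or nothing was read.
   (read_first_line is a hypothetical name used only in this sketch.) */
int read_first_line(const char* filename, char* buf, int size)
{
    FILE* pf = fopen(filename, "r");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }

    /* fgets stops after size-1 characters, after a newline, or at end-of-file */
    char* result = fgets(buf, size, pf);

    fclose(pf);
    return result == NULL ? -1 : 0;
}
```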

fscanf

int fscanf ( FILE * stream, const char * format, ... );

This function reads data from the stream according to the format string and stores it in the locations pointed to by the additional arguments.

fprintf

int fprintf ( FILE * stream, const char * format, ... );

This function writes formatted data to the stream.
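The following sketch pairs fprintf with fscanf: one function writes a structure to a file as text, the other reads it back. It borrows the struct Stu layout used later in this article; the names save_stu and load_stu are hypothetical. The %19s conversion limits the name to what the 20-byte array can hold, and it assumes the name contains no spaces.

```c
#include <stdio.h>

struct Stu
{
    int age;
    int weight;
    char name[20];
};

/* Writes one record to filename as text. Returns 0 on success, -1 on failure.
   (save_stu and load_stu are hypothetical names used only in this sketch.) */
int save_stu(const char* filename, const struct Stu* s)
{
    FILE* pf = fopen(filename, "w");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }
    int printed = fprintf(pf, "%d %d %s\n", s->age, s->weight, s->name);
    if (fclose(pf) == EOF)
        return -1;
    return printed < 0 ? -1 : 0;
}

/* Reads one record back. Returns 0 if all three fields were read, else -1. */
int load_stu(const char* filename, struct Stu* s)
{
    FILE* pf = fopen(filename, "r");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }
    /* fscanf returns the number of fields successfully converted */
    int fields = fscanf(pf, "%d %d %19s", &s->age, &s->weight, s->name);
    fclose(pf);
    return fields == 3 ? 0 : -1;
}
```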

The functions above all read and write files in text form.
The following two functions read and write files in binary form.

fread

size_t fread ( void * ptr, size_t size, size_t count, FILE * stream );

This function reads count elements of size bytes each from the stream and stores them, without any conversion, in the block of memory pointed to by ptr.
Introduction to the four parameters:

ptr     Pointer to a block of memory at least (size*count) bytes in size, converted to a void*.

size    The size, in bytes, of each element to read.

count   The number of elements to read, each of size bytes.

stream  Pointer to a FILE object specifying the input stream.

The return value of the function:
Returns the total number of elements successfully read.
If the returned number differs from the count parameter, either a read error occurred or the end of the file was reached. In both cases the corresponding indicator is set, which can be checked with ferror and feof respectively.
If size or count is zero, the function returns zero, and both the stream state and the content pointed to by ptr remain unchanged.

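fwrite

size_t fwrite ( const void * ptr, size_t size, size_t count, FILE * stream );

The parameters here are basically the same as those of fread, except that ptr points to the data to be written. The return value is the number of elements successfully written.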
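A sketch of a binary round trip with fwrite and fread, again borrowing the struct Stu layout used later in this article. The names save_students and load_students are hypothetical. Asking fread for more elements than the file holds is fine: it simply returns the number it actually read.

```c
#include <stdio.h>

struct Stu
{
    int age;
    int weight;
    char name[20];
};

/* Writes count records to filename in binary form.
   Returns the number of records written (0 on failure).
   (save_students and load_students are hypothetical names.) */
size_t save_students(const char* filename, const struct Stu* s, size_t count)
{
    FILE* pf = fopen(filename, "wb");
    if (pf == NULL)
    {
        perror("fopen");
        return 0;
    }
    size_t written = fwrite(s, sizeof(struct Stu), count, pf);
    if (fclose(pf) == EOF)
        return 0;
    return written;
}

/* Reads up to max_count records from filename.
   Returns the number of records actually read. */
size_t load_students(const char* filename, struct Stu* s, size_t max_count)
{
    FILE* pf = fopen(filename, "rb");
    if (pf == NULL)
    {
        perror("fopen");
        return 0;
    }
    size_t got = fread(s, sizeof(struct Stu), max_count, pf);
    if (got < max_count && ferror(pf))
        perror("fread"); /* a short count caused by an error, not by end-of-file */
    fclose(pf);
    return got;
}
```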

The functions above were described in terms of file streams, but fputc, fprintf, and fscanf work just as well with the standard streams: pass stdout where an output stream is expected and stdin where an input stream is expected.
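To illustrate that these functions accept any stream, the sketch below takes its input and output streams as parameters. The name echo_number is hypothetical. It reads an integer with fscanf and writes it back with fprintf and fputc.

```c
#include <stdio.h>

/* Reads one integer from in and writes it, followed by a newline, to out.
   Returns 0 on success and -1 on failure.
   (echo_number is a hypothetical name used only in this sketch.) */
int echo_number(FILE* in, FILE* out)
{
    int n = 0;
    if (fscanf(in, "%d", &n) != 1)   /* formatted input from any input stream */
        return -1;
    if (fprintf(out, "%d", n) < 0)   /* formatted output to any output stream */
        return -1;
    if (fputc('\n', out) == EOF)     /* character output to any output stream */
        return -1;
    return 0;
}
```

Calling echo_number(stdin, stdout) from main reads a number from the keyboard and prints it on the screen; passing two FILE* values obtained from fopen does the same thing with files.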

Let’s compare these groups of functions:

scanf / fscanf / sscanf (input)
printf / fprintf / sprintf (output)

Let’s talk about sscanf and sprintf first.

int sscanf ( const char * s, const char * format, ... );

sscanf reads formatted data from a string.

int sprintf ( char * str, const char * format, ... );

sprintf writes formatted data into a string.

See example:
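A sketch of converting a structure to a string and back, borrowing the struct Stu layout used later in this article. The names stu_to_string and string_to_stu are hypothetical. Note one substitution: the sketch uses snprintf, the size-checked C99 variant of sprintf, so that an undersized buffer is reported instead of overflowed.

```c
#include <stdio.h>

struct Stu
{
    int age;
    int weight;
    char name[20];
};

/* Formats *s into buf as "age weight name".
   Returns 0 on success, -1 if buf is too small or formatting fails.
   (stu_to_string and string_to_stu are hypothetical names.) */
int stu_to_string(const struct Stu* s, char* buf, size_t size)
{
    /* snprintf returns the length the full text needs, excluding '\0' */
    int n = snprintf(buf, size, "%d %d %s", s->age, s->weight, s->name);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Parses "age weight name" from text into *s.
   Returns 0 if all three fields were converted, otherwise -1. */
int string_to_stu(const char* text, struct Stu* s)
{
    int fields = sscanf(text, "%d %d %19s", &s->age, &s->weight, s->name);
    return fields == 3 ? 0 : -1;
}
```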

What is the difference between them?

scanf is a formatted input function for the standard input stream (stdin).
printf is a formatted output function for the standard output stream (stdout).

fscanf is a formatted input function for all input streams (filestream/stdin).
fprintf is a formatted output function for all output streams (filestream/stdout).

sscanf can convert strings to formatted data.
sprintf can convert formatted data into strings.

5. Random reading and writing of files

The file functions above read and write sequentially. We can also use the following functions for random reading and writing.

fseek

int fseek ( FILE * stream, long int offset, int origin );

This function repositions the stream position indicator. Simply put, it moves the file pointer to a new position in the file.
It takes three parameters: the first is the stream, the second is the offset, and the third is the origin from which the offset is measured.

There are three options for the third parameter:
SEEK_SET beginning of file
SEEK_CUR current position of the file pointer
SEEK_END end of file

If the third parameter is SEEK_CUR, for example, a positive offset moves the file pointer toward the end of the file, and a negative offset moves it back toward the beginning.

In short, fseek positions the file pointer at the origin plus the offset.

For example:
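A sketch that exercises all three origins on a file assumed to contain the six bytes abcdef. The names read_at and seek_demo are hypothetical. The file is opened in binary mode because arbitrary offsets are only well defined by the C standard for binary streams.

```c
#include <stdio.h>

/* Moves the file pointer with fseek, then reads one character there.
   Returns the character, or EOF on failure.
   (read_at and seek_demo are hypothetical names used only in this sketch.) */
static int read_at(FILE* pf, long offset, int origin)
{
    if (fseek(pf, offset, origin) != 0)
        return EOF;
    return fgetc(pf);
}

/* Fills out with three characters picked from filename via fseek.
   Returns 0 on success and -1 on failure. */
int seek_demo(const char* filename, char out[4])
{
    FILE* pf = fopen(filename, "rb");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }

    int c1 = read_at(pf, 2, SEEK_SET);  /* 2 bytes from the beginning */
    int c2 = read_at(pf, 1, SEEK_CUR);  /* skip 1 byte from the current position */
    int c3 = read_at(pf, -1, SEEK_END); /* 1 byte back from the end: the last byte */
    fclose(pf);

    if (c1 == EOF || c2 == EOF || c3 == EOF)
        return -1;

    out[0] = (char)c1;
    out[1] = (char)c2;
    out[2] = (char)c3;
    out[3] = '\0';
    return 0;
}
```

For a file holding abcdef, the three reads land on c, e, and f: reading c leaves the pointer on d, the SEEK_CUR offset of 1 skips d, and SEEK_END with -1 selects the final byte.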

ftell

long int ftell ( FILE * stream );

Returns the offset of the file pointer relative to the beginning of the file.

rewind

void rewind ( FILE * stream );

Returns the file pointer to the beginning of the file.
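A small sketch combining the two functions: it reads three characters, asks ftell for the position, calls rewind, and asks again. The name tell_and_rewind is hypothetical, and the file is assumed to hold at least three bytes. Binary mode is used so that the value from ftell is a plain byte offset.

```c
#include <stdio.h>

/* Reads three characters from filename, then rewinds.
   Stores the position reported by ftell after the reads and after rewind.
   Returns 0 on success and -1 on failure.
   (tell_and_rewind is a hypothetical name used only in this sketch.) */
int tell_and_rewind(const char* filename, long* after_read, long* after_rewind)
{
    FILE* pf = fopen(filename, "rb");
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }

    for (int i = 0; i < 3; i++)
        fgetc(pf);               /* each read advances the file pointer by one */

    *after_read = ftell(pf);     /* offset from the beginning of the file */
    rewind(pf);                  /* back to the beginning */
    *after_rewind = ftell(pf);

    fclose(pf);
    return (*after_read < 0 || *after_rewind < 0) ? -1 : 0;
}
```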

6. Text files and binary files

Depending on how the data is organized, data files are either text files or binary files.
Data is stored in binary form in memory; if it is output to external storage without conversion, the result is a binary file.
If it must be stored on external storage as ASCII codes, it has to be converted before it is stored. A file stored in the form of ASCII characters is a text file.

How is a piece of data stored in a file?
Characters are always stored in ASCII form, while numeric data can be stored in either ASCII or binary form.
Take the integer 10000: if it is output to disk in ASCII form, it occupies 5 bytes (one byte per character); if it is output in binary form, it occupies only 4 bytes, the size of an int (tested with VS2013).

example:

In binary form, 10000 is the 4-byte value 00000000 00000000 00100111 00010000.
In ASCII form it is the 5 bytes 00110001 00110000 00110000 00110000 00110000, one ASCII code per digit, which is 1 0 0 0 0 in sequence.
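The size difference can be checked with a short sketch that writes the same integer once with fwrite and once with fprintf, then counts the bytes in each file. The function names are hypothetical. The binary file has sizeof(int) bytes, which is 4 on VS and most current platforms, while the text file for 10000 has 5.

```c
#include <stdio.h>

/* Writes value to filename in binary form. Returns 0 on success, -1 on failure.
   (All three function names are hypothetical, used only in this sketch.) */
int write_int_binary(const char* filename, int value)
{
    FILE* pf = fopen(filename, "wb");
    if (pf == NULL)
        return -1;
    size_t written = fwrite(&value, sizeof(value), 1, pf);
    if (fclose(pf) == EOF)
        return -1;
    return written == 1 ? 0 : -1;
}

/* Writes value to filename as decimal text. Returns 0 on success, -1 on failure. */
int write_int_text(const char* filename, int value)
{
    FILE* pf = fopen(filename, "w");
    if (pf == NULL)
        return -1;
    int printed = fprintf(pf, "%d", value);
    if (fclose(pf) == EOF)
        return -1;
    return printed < 0 ? -1 : 0;
}

/* Counts the bytes in filename. Returns -1 if the file cannot be opened. */
long count_bytes(const char* filename)
{
    FILE* pf = fopen(filename, "rb");
    if (pf == NULL)
        return -1;
    long n = 0;
    while (fgetc(pf) != EOF)
        n++;
    fclose(pf);
    return n;
}
```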

7. Judgment of the end of file reading

feof

Keep in mind: while a file is being read, the return value of the feof function must not be used directly to decide whether the file has ended.

Instead, feof is used after reading has stopped, to determine whether it stopped because the end of the file was reached or because a read failed.

  1. To tell whether reading a text file has ended, check whether the return value is EOF ( fgetc ) or NULL ( fgets ).
    For example:
    fgetc checks whether the return value is EOF .
    fgets checks whether the return value is NULL .
  2. To tell whether reading a binary file has ended, check whether the return value is less than the number of elements requested.
    For example:
    fread checks whether the return value is less than the number of elements requested.

Correct usage:

Take a text file as an example:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    //open a file
    FILE* pf = fopen("test.dat", "r");
    if (pf == NULL)
    {
        perror("fopen");
        exit(-1);
    }

    // file operation
    int ch = 0; // int rather than char, so that EOF can be told apart from every valid character
    while ((ch = fgetc(pf)) != EOF)
    {
        putchar(ch);
    }
    putchar('\n');

    // reading has stopped: find out why
    if (feof(pf))
        printf("end of file\n");
    else if (ferror(pf))
        printf("error\n");

    // close the file
    fclose(pf);
    pf = NULL;

    return 0;
}

Take the binary file as an example:

#include <stdio.h>
#include <stdlib.h>

struct Stu
{
    int age;
    int weight;
    char name[20];
};

int main()
{
    //open a file ("rb": read a binary file)
    FILE* pf = fopen("test.dat", "rb");
    if (pf == NULL)
    {
        perror("fopen");
        exit(-1);
    }

    /* To create test.dat first, open it with "wb" instead and write five records:
    struct Stu s = { 20, 70, "zhangsan" };
    fwrite(&s, sizeof(s), 1, pf);
    fwrite(&s, sizeof(s), 1, pf);
    fwrite(&s, sizeof(s), 1, pf);
    fwrite(&s, sizeof(s), 1, pf);
    fwrite(&s, sizeof(s), 1, pf);*/

    struct Stu s[10];

    // fread returns the number of elements actually read
    size_t ret = fread(s, sizeof(struct Stu), 5, pf);

    if (ret == 5)
    {
        printf("success\n");
        for (int i = 0; i < 5; i++)
        {
            printf("%d %d %s\n", s[i].age, s[i].weight, s[i].name);
        }
    }
    else
    {
        // fewer than 5 elements were read: find out why
        if (feof(pf))
        {
            printf("end\n");
        }
        else if (ferror(pf))
        {
            printf("err\n");
        }
    }

    // close the file
    fclose(pf);
    pf = NULL;
    return 0;
}

8. File buffer

The ANSI C standard uses a “buffered file system” to process data files. This means that the system automatically sets up a “file buffer” in memory for each file being used in the program. Data output from memory to disk is first sent to the buffer in memory, and is written to disk only once the buffer is full. When data is read from disk, it is first read from the disk file into the buffer until the buffer is full, and then passed from the buffer to the program data area (program variables, etc.) piece by piece. The size of the buffer is determined by the C implementation.

In other words, the data you write is not sent to the file immediately. It goes to the buffer first, and reaches the file once the buffer is full. You can also flush the buffer manually with fflush, which writes the data in the buffer out to the file.

#include <stdio.h>
#include <windows.h> // Sleep (Windows only)

int main()
{
    FILE* pf = fopen("test.txt", "w");
    if (pf == NULL)
    {
        perror("fopen");
        return 1;
    }
    fputs("abcdef", pf); // the data goes into the output buffer first
    printf("Sleep for 10 seconds - the data has been written, but if you open test.txt now the file has no content\n");
    Sleep(10000);
    printf("Flush the buffer\n");
    fflush(pf); // flushing writes the data in the output buffer to the file (disk)
    //Note: in newer versions of VS, fflush(stdin) no longer clears pending input; flushing an output stream as done here still works
    printf("Sleep for another 10 seconds - open test.txt again and the file now has content\n");
    Sleep(10000);
    fclose(pf);
    //Note: fclose also flushes the buffer when closing the file
    pf = NULL;
    return 0;
}

Because of the buffer, a C program that works with files needs to flush the buffer or close the file when it has finished its file operations.

Otherwise, data still sitting in the buffer may never reach the file, which causes problems when reading and writing files.

Finish.