Python project: reversing file content and writing the result (1) Full reversal

Article directory

  • Reference
  • Description
  • Project
  • Full reversal
      • File pointer
      • Revisit
          • Verification
          • seek() function
          • Optimization
      • Blank line at the end of the text
          • Guess

Reference

Item             Description
Search engine    Bing

Description

Item                Description
Python              3.10.6
Operating system    Ubuntu 22.04.2 LTS (64-bit)

Project

The purpose of this project is to reverse the contents of a target file and write the reversed result to another file (the result file). Although the project looks simple, we will implement it in three ways (full reversal, line by line, and word by word) and cover the relevant background along the way. At the end, we will compare the advantages and disadvantages of the three methods and the situations each one suits. I believe this will give us a deeper understanding of file reading and writing in Python.

An example

The content of the file target.txt is:

Hello World

Our goal is to reverse the contents of this file and save the result to the file result.txt. In the end, result.txt will look like this:

dlroW olleH

Full reversal

Read the entire content of the target file into memory and reverse it there, then write the reversed result to the result file. The implementation is as follows:

# Open the target file target.txt in read-only mode
with open('target.txt', 'r') as ft:
    # Read the content of the target file and print it
    content = ft.read()
    print(content)
    # Dividing line
    print('-------------')

    # Open the result file result.txt for both reading and writing
    with open('result.txt', 'w+') as fr:
        # Reverse the content read from the target file
        result = content[::-1]
        # Write the reversed result to the result file result.txt
        fr.write(result)
        # Try to read the contents of the result file and print them
        if not fr.read():
            print('Not Content')
        else:
            print(fr.read())

Execution result

Hello World

-------------
Not Content

Analysis

if not fr.read():
    print('Not Content')
else:
    print(fr.read())

The code above prints Not Content, which means that fr.read() read nothing from the file. That does not mean result.txt is empty. Before going further, let me introduce the concept of the file pointer.

File pointer

The file pointer stores the position at which Python's built-in file operations start working.
The read() function reads from the position of the file pointer to the end of the file. Where write() starts depends on the mode in which the file was opened. In write mode, the existing content is discarded, the file pointer starts at the beginning of the file, and write() writes from there. In append mode, the file pointer starts at the end of the file, and write() writes from there.
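The difference between the two modes can be checked with tell(). The following is a minimal sketch, not part of the original project: the helper name pointer_positions and the temporary file demo.txt are assumptions made for illustration.

```python
import os
import tempfile


def pointer_positions(initial_text):
    """Return the tell() value right after opening a file in 'w' and in 'a' mode."""
    # Hypothetical throwaway file, so no real data is touched.
    path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
    with open(path, 'w') as f:
        f.write(initial_text)

    # Append mode: the pointer starts at the end of the existing content.
    with open(path, 'a') as f:
        append_start = f.tell()

    # Write mode: the file is truncated and the pointer starts at 0.
    with open(path, 'w') as f:
        write_start = f.tell()

    return write_start, append_start


print(pointer_positions('Hello World'))  # (0, 11)
```

The append-mode value equals the length of the existing content, while write mode always starts from 0 because the file is emptied on opening.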

Revisit

Verification

In the previous example, we first wrote to the file with write() and then read it with read(). Writing with write() leaves the file pointer at the end of the file, so the read() that follows has nothing left to read and the program prints Not Content. We can verify this with the tell() function.

with open('target.txt', 'r') as ft:
    content = ft.read()
    print(content)
    print('-------------')

    with open('result.txt', 'w+') as fr:
        result = content[::-1]
        # Print the position of the file pointer before writing
        print(fr.tell())
        fr.write(result)
        if not fr.read():
            print('Not Content')
            # Print the position of the file pointer after writing and reading
            print(fr.tell())
        else:
            print(fr.read())

Execution result

Hello World

-------------
0
Not Content
12

Before the reversed content is written to the result file, the file pointer is at position 0, the start of the file. After the write, the file pointer has moved to the end of the file. Why the position is 12 rather than 11 is discussed later.

seek() function

The seek() function accepts two parameters: offset, which is required, and base, which is optional (the Python documentation calls this parameter whence). They are described below:

Item      Description
offset    Specifies the offset of the file pointer relative to the offset base. It can be any integer, including a negative one.
base      Specifies the offset base of the file pointer. The default value is 0. If the value is 0, the offset is measured from the start of the file. If the value is 1, the offset is measured from the current position of the file pointer. If the value is 2, the offset is measured from the end of the file.
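The three base values can be tried out on a small binary stream. This is a sketch under stated assumptions: an in-memory io.BytesIO object stands in for a file opened in binary mode, and the helper name seek_positions is hypothetical.

```python
import io


def seek_positions():
    """Show what seek() returns for each of the three base values."""
    # In-memory binary stream standing in for a file opened in 'rb' mode (assumption).
    buf = io.BytesIO(b'Hello World')

    from_start = buf.seek(6, 0)    # 6 bytes after the start of the stream
    from_current = buf.seek(2, 1)  # 2 bytes after the current position (6 + 2)
    from_end = buf.seek(-5, 2)     # 5 bytes before the end (11 - 5)
    tail = buf.read()              # everything from position 6 to the end

    return from_start, from_current, from_end, tail


print(seek_positions())  # (6, 8, 6, b'World')
```

The os module also provides the named constants os.SEEK_SET, os.SEEK_CUR and os.SEEK_END for the values 0, 1 and 2, which may read more clearly than bare numbers.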

Note:

  1. The seek() function returns the new position of the file pointer.
  2. The seek() function cannot move the file pointer to a negative position. If you try, Python raises an error. For example:
with open('result.txt', 'r') as fd:
    print(fd.seek(-1, 0))

This raises the error:

ValueError: negative seek position -1
  3. If the file is opened in text mode (not binary mode) and the second parameter passed to seek() is 1 or 2, the offset must be 0. Otherwise, Python raises an error, because text-mode files only allow nonzero offsets relative to the start of the file.
  4. The file pointer can be moved past the end of the file. Seeking alone does not change the file, but if content is then written at that position, the gap between the old end of the file and the file pointer is filled with null bytes. For example:
# Open the file in binary mode for both reading and writing
fo = open('result.txt', 'rb+')
# Move the file pointer 30 bytes past the end of the file
print(fo.seek(30, 2))
# Write content to the file
# (since the file is opened in binary mode, the written content must be encoded)
fo.write('RedHeart'.encode())
# Print the position the file pointer now points to
print(fo.tell())
# Move the file pointer back to the beginning of the file
print(fo.seek(0, 0))
# Read everything in the file
# (since the file is opened in binary mode, the read content must be decoded)
print(fo.read().decode())
fo.close()

Execution result

After running the program five times, the output of the fifth run is as follows (the file was empty before the first run):

182
190
0
RedHeart RedHeart RedHeart RedHeart RedHeart

However, when the file is viewed in an editor (gedit here), you may see the following content:

\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\ 00\00\00\00\00\00\00\00\00\00\00\00\00\00\00RedHeart\00\00\ \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00 \00\00\00\00\00\00\00\00\00\00\00RedHeart\00\00\00\00\00\ 00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\ \0\00\00\00\00\00\00\00RedHeart\00\00\00\00\00\00\00\00\00 \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\ 00\00\00\00\00RedHeart\00\00\00\00\00\00\00\00\00\00\00\00\ \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00 \00RedHeart

That is, the gap between the old end of the file (before the new content was written) and the position of the file pointer at the time of writing is filled with null bytes, which gedit displays as \00.
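The text-mode restriction on seek() noted above can also be observed directly. The following is a minimal sketch: the helper name try_end_relative_seek and the temporary file demo.txt are assumptions for illustration, and the exact wording of the error message may differ between Python versions.

```python
import io
import os
import tempfile


def try_end_relative_seek(mode):
    """Attempt seek(-5, 2) on a small file opened in the given mode."""
    # Hypothetical throwaway file containing 11 ASCII characters.
    path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
    with open(path, 'w') as f:
        f.write('Hello World')

    with open(path, mode) as f:
        try:
            # Nonzero offset relative to the end of the file.
            return f.seek(-5, 2)
        except io.UnsupportedOperation as exc:
            return f'error: {exc}'


print(try_end_relative_seek('r'))   # text mode: reports an error
print(try_end_relative_seek('rb'))  # binary mode: 6
```

In text mode the call is rejected with io.UnsupportedOperation, while in binary mode the same call succeeds and returns the new position.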

Optimization

Fortunately, Python provides the seek() function to change the position of the file pointer, which lets us process files more flexibly.

We can use seek() to fix the previous example. The improved version is as follows:

with open('target.txt', 'r') as ft:
    content = ft.read()
    print(content)
    print('-------------')

    with open('result.txt', 'w+') as fr:
        result = content[::-1]
        fr.write(result)
        # Use seek() to move the file pointer back to the beginning of the file
        fr.seek(0, 0)
        # Read once and reuse the value; a second read() would start at the end of the file
        written = fr.read()
        if not written:
            print('Not Content')
        else:
            print(written)

Execution result

Hello World

-------------

dlroW olleH

Blank line at the end of the text

My original explanation for the blank line at the end of the read result was:
The end-of-file (End Of File, EOF) marker marks the end of a file. Python stops reading the file when it reaches this marker and parses it as a newline character. The read result therefore contains one extra blank line.

To test this, I ran the following code to read the contents of the result file.

with open('result.txt', 'r') as fo:
    print(fo.read())

Execution result

dlroW olleH

If the original guess were correct, the output would have an extra blank line after dlroW olleH, because the result file also ends with an end-of-file marker, which according to the guess would be converted to a newline when read:

dlroW olleH

No such blank line appears, which shows that the guess was wrong.

Guess

The target file was written with Vim, an editor commonly used on Linux, and our Python program only reads it, so I made the following guess:

Editors on Linux automatically add a newline character \n at the end of a file when saving its content.

We can verify this with the following code:

with open('target.txt', 'rb') as fo:
    print(fo.read())

Execution result

Since we opened the file in binary mode, the newline character \n is visible in the output.

b'Hello World\n'

The newline at the end of the file shows that our guess was correct. It also explains why tell() reported 12 rather than 11 earlier: the target file holds the 11 characters of Hello World plus the trailing newline. As for why Linux editors add a newline character at the end of a file, a quick search with a search engine should turn up the answer.
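Because of this trailing newline, a plain content[::-1] moves the newline to the front of the result. If that is not wanted, one possible approach is to set the newline aside before reversing. This is only a sketch, and the helper name reverse_keep_trailing_newline is an assumption made for illustration, not part of the original project.

```python
def reverse_keep_trailing_newline(content):
    """Reverse text while leaving a single trailing newline (if any) at the end."""
    if content.endswith('\n'):
        # Reverse everything except the final newline, then put the newline back.
        return content[:-1][::-1] + '\n'
    return content[::-1]


# repr() makes the position of the newline visible.
print(repr('Hello World\n'[::-1]))                            # '\ndlroW olleH'
print(repr(reverse_keep_trailing_newline('Hello World\n')))   # 'dlroW olleH\n'
```

Whether the newline should move with the rest of the text depends on what "reversed" is meant to cover, so the plain slice used in the project is equally valid.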