1. Experimental purpose
Proficient in the application of the built-in function open();
Understand the impact of string encoding format on text file operations;
Proficient in the usage of the context management statement with;
Understand how the standard library json reads and writes JSON files;
Understand how the extension libraries python-docx, openpyxl, and python-pptx operate on Office documents.
2. Experimental content
1. Merge the contents of two .txt files, and write the multiple lines of the two files to the result file alternately. If one file has less content, write the remaining content of the other file to the end of the result file.
1.1Experimental preparation: Prepare two Documents 1.txt 2.txt
1.2Experiment code:
def mergeTxt(txtFiles): with open('result.txt', 'w', encoding='utf-8') as fp: with open(txtFiles[0], 'r + ') as fp1, open(txtFiles[1], 'r + ') as fp2: # When the lines of two files can be read, enter the loop while True: line1 = fp1.readline() line2 = fp2.readline() # If there is content to read if line1 or line2: if line1:# line1 has content fp.write(line1) if line2:# line2 has content fp.write(line2) # If there is no content to read, break out of the loop else: break txtFiles = ['1.txt', '2.txt'] mergeTxt(txtFiles)
1.3Experimental results:
View the result.txt file:
2. Complete the code: write the following list of house information into the JSON file (house information.json), and then read and output the information.
2.1Experiment code:
import json information = [ {'Community name': 'Community A', 'Average price': 8000, 'Monthly transaction volume': 20}, {'Community name': 'Community B', 'Average price': 8500, 'Monthly transaction volume': 35}, {'Community name': 'Community C', 'Average price': 7800, 'Monthly transaction volume': 50}, {'Community name': 'Community D', 'Average price': 12000, 'Monthly transaction volume': 18}] #At this time information is a list type with open("housing information.json", "w") as f: # Convert data type to string and store in json file json.dump(information, f) print("File writing completed!") with open("housing information.json", 'r') as load_f: # Convert file opening from string to data type load_dict = json.load(load_f) print(load_dict)
2.2Experimental results:
3. Write a program to simulate and generate the turnover data of a hotel during its 100 consecutive days of trial operation starting from January 1, 2020, and write it into a CSV file. There are two columns in the file, the first column is the date, the second column is the turnover, and the first column of the file is the header or field name. Assume that the hotel’s base turnover on the first day is 500 yuan, and it increases by 5 yuan every day. In addition, it will randomly increase by 5 to 50 yuan every day.
3.1 Experiment code:
import csv import random import datetime with open('turnover.csv', 'w', encoding='utf-8') as f: writer = csv.writer(f, lineterminator='\ ') # Write field names to the csv file writer.writerow(['Date', 'Turnover']) # Generate data on January 1, 2020 date = datetime.date(2020,1,1) # Randomly generate 100 turnover data for i in range(100): # Sales will increase by 5 yuan every day, and in addition, it will increase randomly by 5 to 50 yuan every day. amount = 500 + i*5 + random.randint(5, 50) writer.writerow([str(date), amount]) # Accumulate and generate consecutive dates starting from January 1, 2020 date = date + datetime.timedelta(days=1)
3.2Experimental results:
View the “turnover.csv” file results:
4. Write a program to read and output the red text in the word document.
4.1Experiment preparation:
Prepare the contents of the 1.docx file:
4.2Experiment code:
from docx import Document # Need to download the python-docx library from docx.shared import RGBColor def CheckRed(fn): document = Document(fn) # Traverse the paragraphs of the document for p in document.paragraphs: # Traverse the character blocks of each segment for r in p.runs: # Determine whether the RGB value of the character block is red (255, 0, 0) if r.font.color.rgb == RGBColor(255, 0, 0): print(r.text) # Traverse the table for table in document.tables: # Traverse each row of each table: for row in table.rows: # Traverse the cells in each row: for cell in row.cells: # Traverse the contents of each cell: for p in cell.paragraphs: for r in p.runs: if r.font.color.rgb == RGBColor(255, 0, 0): print(r.text) CheckRed('1.docx')
4.3Experimental results:
5. The file “Supermarket Sales.xlsx” records the sales of each employee in a supermarket at different times and at different counters from March 1 to 5, 2019. Write a program to read the data in this file and count the total sales of each employee, the total sales of each period, and the total sales of each counter.
5.1Experiment preparation:
Period: 8:00-14:00 14:00-20:00
Counter: fresh fruits, daily necessities, snacks and drinks
Employees: 1001 Xiao Wang 1002 Xiao Li 1003 Xiao Chen 1004 Xiao Liu 1005 Xiao Zhang 1006 Xiao Du
5.2Experiment code:
from openpyxl import load_workbook def Count(): # Declare three dictionaries to store the total sales of employees, time periods, and counters respectively. staff = dict() periods = dict() counters = dict() # Read xlsx document ws = load_workbook('Supermarket turnover.xlsx').worksheets[0] # Traverse each row in the table for index, row in enumerate(ws.rows): # Need to skip the first line if index == 0: continue # Get employee name, time period, counter, sales and other information _, name, _, period, counter, money = map(lambda cell: cell.value, row) # Accumulate sales based on employee names staffs[name] = staffs.get(name, 0) + money #Accumulate sales according to different time periods periods[period] = periods.get(period, 0) + money # Accumulate sales according to different counters counters[counter] = counters.get(counter, 0) + money print("Total sales per employee:") print(staffs) print("Total sales for each period:") print(periods) print("Total sales per counter:") print(counters) Count()
5.3Experimental results:
6. Write a program to operate the Excel file, read the contents of the file “Everyone’s Specialties.xlsx”, count the contents of columns A-H, and append a column at the end to summarize everyone’s specialties.
6.1Experiment code:
import random from openpyxl import load_workbook # Function: Randomly generate table data def Random(): wb = load_workbook("Everyone's expertise.xlsx") for w1 in wb: w1 = wb.active wb.remove(w1) sheet = wb.create_sheet("Everyone's Specialties") list_name = ["Li Ming", "Wang Ping", "Liu Jing", "Chen Hong", "Zhao Qiang"] sheet.append(["Name", "Football", "Basketball", "Table Tennis", "Swimming", "Singing", "Dancing", "Eating"])# Add the header content of the first row # Randomly generate "yes" for each person's specialty and insert the results into the table for i in list_name: temp = [] temp.append(i) # For each specialty, use the choice function to randomly select "yes" and " " for j in range(7): temp.append(random.choice("Yes ")) sheet.append(temp) wb.save("Everyone's expertise.xlsx") # Function: summarize all specialties def Summarize(): wb = load_workbook("Everyone's expertise.xlsx") w = wb.active w["I1"] = "All specialties" list_sheet = [ "B", "C", "D", "E", "F", "G", "H"] list_column = ["football", "basketball", "table tennis", "swimming", "singing", "dancing", "eating"] # Traverse each student for i in range(2, w.max_row + 1): temp = [] # Traverse the specialties of each column for j in range(len(list_sheet)): # Check whether the value in column j and row i is "Yes" if w[list_sheet[j] + str(i)].value == "Yes": temp.append(list_column[j]) # Add the summary information of "all specialties" in column I and row i, separated by commas w["I" + str(i)] = ",".join(temp) wb.save("Everyone's Specialties (Summary Version).xlsx") Random() Summarize()
6.2Experimental results:
Contents of the “Everyone’s Specialties.xlsx” file randomly generated by the code:
Contents of the summarized result file “Everyone’s Specialties (Summary Version).xlsx”:
3. Experiment summary
In this experiment, I learned to use some libraries of Python to operate files. I completed 6 questions in total. The overall difficulty of the questions was not very difficult. I successfully obtained the expected results. I also learned various operations on files. The overall gain was very great. big. In addition, I also encountered some problems during the experiment, which were successfully solved after searching for information and consulting classmates. The records are as follows:
1. For question 4, find the red text in the document
In the program, I divided the text in the document into paragraph text and table text and traversed and searched respectively. There is no problem when searching for paragraph text, but when using the judgment statement if cell.font.color.rgb == RGBColor(255, 0, 0) when searching for table text, an error is reported: cell does not have a font attribute. After consulting my classmates, I found that the content in the cell can be regarded as “paragraph text” and traversed again. After this operation, the red text in the table was successfully found.
2. For question 6, summarize everyone’s strengths
When the program is running, an error occurs when zipfile.BadZipFile: File is not a zip file. The reason for the search is: the xlsx file cannot be created directly in the pycharm file, and it will be recognized as an invalid file. A real table file must be created in WPS.
When the program is running, an error message ‘Workbook’ object has no attribute ‘append’ is displayed. The reason is: data cannot be added directly to the workbook file. A new sheet must be created first. The corresponding statement is sheet = wb.create_sheet(“Everyone’s specialty”) , and then you can insert data into the sheet.
The program runs successfully, but the content of the “Everyone’s Specialty.xlsx” file is empty. To find the solution on CSDN, you need to add the following statement: for w1 in wb: w1 = wb.active wb.remove(w1)
3. For file data
At first, for the file data used in questions 5 and 6, I typed the data directly in Excel. Later, I tried to use python code to generate file data. I found that the latter can better train me to use the python library to analyze files. The ability to perform operations also enabled me to learn a lot of new knowledge about file operations.