?♂? Personal homepage: @ aiperson’s personal homepage
?About the author: Python learner
I hope everyone will support us and we will make progress together!
If the article is helpful to you,
Welcome to comment Like Collection Add follow +
Table of Contents
1. Python processing Word
2. Fill in the data in the table of the Word document with one click
3. Past recommendations
1. Python processing Word
- The benefits of Python processing Word
There are many advantages to working with Word documents using Python, here are some of the major ones:
-
Automation and batch processing: Python can help you automate processing of large numbers of Word documents, such as replacing text in batches, generating reports, extracting data, etc. This saves a lot of time and labor.
-
Data Extraction and Transformation: If you need to extract specific data from a large number of Word documents, Python can easily achieve this. You can write scripts to search and extract the information you need, then convert it to other formats such as CSV, Excel, or a database.
-
Customization and freedom: Using Python to process Word documents means you can customize the process to suit your needs. You can create custom templates, styles, and layouts, and flexibly adjust them to your project needs.
-
Integration with other tools and technologies: Python can be integrated with many other libraries and technologies, such as databases, web services, data analysis tools, etc. This allows you to combine the processing of Word documents with other workflows to achieve more complex tasks.
-
Document generation and reporting: If you need to generate reports or documents on a regular basis, Python can do this by automatically generating text, charts, tables, etc. This is useful for data analysis, project management, business reporting, etc.
-
Version control and code reuse: Using Python code to process Word documents can achieve code reuse and version control. You can store the code that processes documents in a version control system so team members can collaborate, maintain, and share the code.
-
Cross-platform: Most Python libraries can run on multiple operating systems, so you can work with Word documents on different platforms such as Windows, Linux and macOS.
- Python third-party library for processing Word
In Python, there are several third-party libraries that can be used to process Word documents. The following are some commonly used third-party libraries:
-
python-docx: This is a popular library for creating and modifying Microsoft Word documents (.docx format). It allows you to read, edit and create Word documents, including paragraphs, tables, images, etc.
-
pywin32 (Windows only): This library allows you to interact with Microsoft Office applications on Windows, including Word. It can be used to automate operations such as creating, opening, editing, and saving documents.
-
docx2txt: This library can be used to convert .docx files to plain text (.txt format). It is suitable for extracting text content from Word documents.
-
python-docx-template: This library extends the
python-docx
library to enable you to insert variables and loops into templates and generate Word documents in a more dynamic way. -
py-docx: This is another library for processing .docx files. It provides functionality similar to
python-docx
, but uses a different interface.
2. Fill in the data in the table of the Word document with one click
Technical Tools:
Python version: 3.9
Code editor: jupyter notebook
At work, we often fill in part of the information in the Excel table into the corresponding table in the Word document to generate a report for easy printing. Today we will try how to fill in data in a fixed Word template. This example requires filling in the “receipt data” of the Excel file into the “receipt record” of the Word file, and adding up the received quantity. The template is as follows.
The required result is this:
First, the data needs to be extracted from the Excel table.
#Extract data from Excel table from openpyxl import load_workbook wb = load_workbook("receipt data.xlsx") ws = wb.active data=[] for row in range(2,ws.max_row + 1): seq = ws["A" + str(row)].value supplier = ws["B" + str(row)].value material_pn = ws["C" + str(row)].value material_model = ws["D" + str(row)].value desp = ws["E" + str(row)].value qty = ws["F" + str(row)].value date = ws["G" + str(row)].value.date() info = [seq, supplier, material_pn, material_model, desp, qty, date] data.append(info)
# Check the data data[0]
Since the data in the quantity column needs to be summed, a summation function is first defined so that it can be called later.
#Define quantity summation function def Sum_list(list): s = 0 for i in list: #accumulate all numbers in the list s + =i return s
# Sum all the numbers in the quantity column qty_list=[] for i in data: qty_list.append(i[5]) #The quantity is at the 6th position of the inner list, the index is 5 sum_qty= Sum_list(qty_list) #Call the summing function to add up sum_qty
Define cell merge function
#Define the function of merging cells def Merge_cells(table,target_list,start_row,col): ''' table: is the table that needs to be operated on target_list: is a target list, that is, a list containing duplicate data start_row: is the start row, that is, the row in the table where data comparison begins (the title needs to be removed) col: is the column that needs to process data ''' start = 0 #Start row counting end = 0 #end line count reference = target_list[0] #Set the reference, starting with the first string in the list for i in range(len(target_list)): #Traverse the list if target_list[i] != reference: #Start the comparison. If the contents are different, perform the following reference = target_list[i] #The reference becomes the next string in the list end=i-1 table.cell(start + start_row,col).merge(table.cell(end + start_row,col)) start=end+1 if i == len(target_list) - 1: #Traverse to the last line and do as follows end=i table.cell(start + start_row,col).merge(table.cell(end + start_row,col))
Then write data into the word table and merge cells.
#After the data extraction is completed, you can write data into the Word table. from docx import Document doc = Document("Receipt Record Template.docx") #Read the data in the second and third columns of the first table in the word document except the title and tail total rows table = doc.tables[0] #It has been determined to be the first table, and its index is 0 supplier = [] #Storage supplier name pn = [] #Storage material code for i in data: supplier.append(i[1]) pn.append(i[2]) #Add rows as needed to fill in data for i in range(len(supplier)): #There is already one line in the template, so you only need to add len(supplier) lines in total table.add_row() #After adding rows, do the merge cell operation first Merge_cells(table,supplier,1,1) #Start merging behavior 2, index is 1; supplier name is in column 2, index is 1 Merge_cells(table,pn,1,2) #Start merging behavior 2, index is 1; material code is in column 3, index is 2 #Write data to the table for row in range(1,len(supplier) + 1): for col in range(7): table.cell(row,col).text = str(data[row-1][col]) max_row = len(table.rows) #Get the largest row qty_row = max_row-1 #Determine a row that needs to be written with total data table.cell(qty_row,5).merge(table.cell(qty_row,5)) #Merge the two cells in the lower right corner used to fill in the quantity table.cell(qty_row,4).text = 'Total number:' table.cell(qty_row,5).text = str(sum_qty) doc.save("Receipt record.docx")
Since cells with the same content in the supplier name and material code columns are merged, the information in these two columns needs to be extracted separately from the total data `data`. Create two new empty lists `supplier` and `pn`, then traverse `data`, take the data out and store it in the empty list. Then, we add some blank lines according to actual needs to facilitate writing data later. The number of new rows is based on the number of elements in the `supplier` list. There is already a blank row in this table. If you include the outside rows and you need to fill in the “total number”, you only need to increase the number of elements in the `supplier` list to correspond to that many rows. Of course, this can also be based on the number of elements in the list `pn`, and the two are equal. After adding rows, first call the function `Merge_cells` to merge cells (if you write data first and then merge, the cell contents will be duplicated). After merging, you can write data to the table. Here, traverse the rows and columns where data needs to be written, and then extract the data from `data`. After the data row is written, the word “Total” and the total material quantity in the last row need to be written. Finally save the data and get the following results:
3. Past recommendations
Python extracts tabular data from pdf (with practical case)
Automatically send emails using Python
Python basics of operating ppt and pdf
Python operation word basics
Python basics of operating excel
Use Python to extract tables from PDF to Excel with one click
4. Recommendations and benefits at the end of the article
3 copies of “Python Automated Office Applications (ChatGPT Version)” are given away for free!
Introduction:
This book comprehensively and systematically introduces the automation solutions of Python language in common office scenarios. The book is divided into 5 parts and 21 chapters, including the basic knowledge of Python language, common methods of reading and writing data in Python, using Python to automatically operate Excel, using Python to automatically operate Word and PPT, and using Python to automatically operate files and folders, emails, PDF files, Pictures, videos, use Python for data visualization analysis and web page interaction, and use ChatGPT to easily advance to Python office automation.
This book is suitable for information workers of all levels. It can be used as an introductory guide for beginners to learn Python, and as a reference manual for intermediate and advanced automated office users. The large number of examples in the book are also suitable for readers to use for reference directly in their work.Editor’s recommendation:
Easily realize office automation with the help of ChatGPT and Python: Excel Home is created by many Microsoft global MVP experts. It uses a large number of examples to introduce the use of Python to operate Excel, Word, PPT and various objects involved in daily office work. Let ordinary office workers without programming experience be able to control Python, realize office automation in multiple scenarios, and improve work efficiency!
- Lucky draw method: 3 friends will be randomly selected from the comment area and given away for free!
- How to participate: Follow the blogger, like, collect, and comment in the comment area “Life is too short, refuse to get involved!” (Remember to like + collect, otherwise the draw will be invalid, and each person can comment up to three times!)
- Event deadline: 2023-09-28 20:00:00
Dangdang purchase link: http://product.dangdang.com/29566495.html
JD.com purchase link: https://item.jd.com/13953308.html
List announcement time: 2023-09-28 21:00:00