8 practical Python automation office skills

Usually when I do data analysis in the company, I also use python to do some office automation work. The leader said yesterday that we can finish other people’s 3-hour work in 3 minutes. O(∩_∩)O~

This article will introduce to you a few office automation skills I use:

1. Word document doc to docx

Last year, I wanted to participate in a data competition. The data in it were all in doc format. I wanted to use python-docx to read the data in word files, but python-docx only supports docx format, so I studied the conversion of these two formats.

1.1 Import Toolkit

import osfrom win32com import client as wc

1.2 Obtain the details of all doc files under the folder

# path
path=C:/Users/yyz/Desktop/python office skills/data/doc to docx/ # Modify according to your own computer files

# Define an empty list to store the absolute path of the file
files = []
for file in os.listdir(path):
    if file.endswith(.doc):
        files.append(path + file)
files


1.3 Converting files

# Run word program
word = wc.Dispatch("Word.Application")
# for loop
i = 0
for file in files:
    try:
        doc = word.Documents.Open(file) #Open word file
        doc.SaveAs("{}x".format(file), 12) #Save as a file with the suffix ".docx", where parameter 12 refers to the docx file
        doc.Close() #Close the original word file
        print(file + ': converted successfully')
        i + =1
    except:
        print(file + ': conversion [unsuccessful]')
        files.append(file) # If an error is reported while reading the file, add the file name to the files list and read again
        pass
print('convert file %i'%i)
# exit word
word. Quit()

2. Convert text addresses to latitude and longitude in batches

During work, the latitude and longitude of the address will be used for map visualization or distance calculation.

2.1 Import Toolkit

# import toolkit
import pandas as pd
import json
from urllib. request import urlopen, quote
import requests

2.2 Define the conversion function

# define function
def getlnglat(address):
    url = 'http://api.map.baidu.com/geocoding/v3/'
    output = 'json'
    ak = "self-applied api" # Baidu map API, need to apply by yourself
    address = quote(address) # Since the address variable in this article is in Chinese, in order to prevent garbled characters, first use quote to encode
    uri = url + '?' + 'address=' + address + ' & amp;output=' + output + ' & amp;ak=' + ak + ' & amp;callback= showLocation ' + '//GET request'
    res=requests.get(uri).text
    temp = json.loads(res) # convert string to json
    lat = temp['result']['location']['lat']
    lng = temp['result']['location']['lng']
    return lng, lat # longitude longitude, latitude latitude,

2.3 Address translation
2.3.1 Single address translation

# single address translation
getlnglat('Gaobeidian District Office, Chaoyang District, Beijing Gaojing Village Committee')
(116.52784003604923, 39.91806508560947)

2.3.2 Batch Address Translation

# read data
data = pd.read_excel('C:/Users/yyz/Desktop/python office skills/data/address information.xlsx')
data

data['longitude'] = ''
data['latitude'] = ''
for i in range(data. shape[0]):
    try:
        data.iloc[i,2] = getlnglat(data.iloc[i,1])[0] # Longitude Convert the address of row i and column 2 (column index is 1) into latitude and longitude, and assign the longitude to Row i, column 3 (column index is 2)
        data.iloc[i,3] = getlnglat(data.iloc[i,1])[1] # latitude
    except:
        pass
    #print(i)
data

3. Calculate the distance by latitude and longitude

installation kit

pip install geopy` `

3.1 Import Toolkit

from geopy.distance import geodesic

3.2 Read data

# read data
data = pd.read_excel('C:/Users/yyz/Desktop/python office skills/data/latitude and longitude calculation distance.xlsx')
data


3.3 Calculate the distance

# Assign latitude and longitude to variables, simplify
wd1 = data['latitude 1'].tolist()
jd1 = data['longitude 1'].tolist()
wd2 = data['latitude 2'].tolist()
jd2 = data['longitude 2'].tolist()

lis1 = []
for i in range(len(data)):
    j= geodesic((wd1[i],jd1[i]), (wd2[i],jd2[i])).km # latitude longitude latitude longitude
    lis1.append(j)
    #print(i)

data['distance'] = lis1
data

4. Convert Baidu longitude and latitude to Gaode longitude and latitude

The company has two systems, the coordinate systems used are different, sometimes it needs to be converted

4.1 Toolkit

# import toolkit
import math import pandas as pd

4.2 Defining functions

# define conversion function
def bdToGaoDe(lon,lat):
    PI = 3.14159265358979324 * 3000.0 / 180.0
    x = lon - 0.0065
    y = lat - 0.006
    z = math. sqrt(x * x + y * y) - 0.00002 * math. sin(y * PI)
    theta = math.atan2(y, x) - 0.000003 * math.cos(x * PI)
    lon = z * math.cos(theta)
    lat = z * math. sin(theta)
    return lon,lat

4.3 Single Transformation

 # single conversion
bdToGaoDe(116.512885, 39.847469)
(116.50647396357492, 39.84120409781157)

4.4 Batch conversion

# read data
data = pd.read_excel('C:/Users/yyz/Desktop/python office skills/data/Baidu longitude and latitude to Gaode.xlsx')
data. head()

wd = data['latitude'].tolist()
jd = data['longitude'].tolist()
# define an empty list
li1 = []
for i in range(len(data)):
    j = bdToGaoDe(jd[i],wd[i])
    li1.append(j)
    
li1
data['longitude_re'] = [i[0] for i in li1]
data['latitude_re'] = [i[1] for i in li1]
data. head()

5. Excel file batch merge

5.1 Toolkit

# import toolkit import pandas as pdimport os

5.2 Get file list

# set file path
path = 'C:/Users/yyz/Desktop/python office skills/data/data merge/'
# Empty list, used to store file paths
files = []
for file in os.listdir(path):
    if file.endswith(.xlsx):
        files.append(path + file)

# view list
files

5.3 Convert storage data

# define an empty dataframe
data = pd. DataFrame()

# Loop through all files
for file in files:
    datai = pd. read_excel(file)
    datai_len = len(datai)
    data = data.append(datai) # add to the total data
    print('Read %i row data, merged file %i column, name: %s'%(datai_len,len(data.columns),file.split('/')[-1]) )
    # Check if all are read and whether the format is wrong
# reset index
data.reset_index(drop=True, inplace=True)

6. Convert Word files to pdf in batches

Only docx files can be converted, and an error will be reported when converting doc files. Toolkit installation

pip install docx2pdf

6.1 Import Toolkit

# Install Toolkit:``# Import Toolkit``from docx2pdf import convert``import os

6.2 Single Transformation

# Single conversion``convert("c:/users/yyz/desktop/Rubiccube formula.docx", "c:/users/yyz/desktop/excel notes.pdf")

6.3 Batch conversion?

# file location
path = 'C:/Users/yyz/Desktop/python office skills/data/word to pdf/'
# Define an empty list to store the file list
files = []
for file in os.listdir(path):
    if file.endswith(".docx"):
        files.append(path + file)
files
for file in files:
   convert(file,file.split('.')[0] + '.pdf')
   print(file + 'converted successfully')

7. Batch read table data in word

toolkit installation

pip install python-docx
import docx
# read word file
doc = docx.Document('C:/Users/yyz/Desktop/python office skills/data/word information.docx')
# Get a list of all table objects in the document
biauges = doc.tables

7.2 Irregular forms

cells = biauges[1]._cells``cells_lis = [[cell.text for cell in cells]]

import pandas as pd
import numpy as np
datai = pd. DataFrame(cells_lis)
datai = datai[[1,3,7,9,14,16,19,21]]
datai.columns = ['Name','Age','Native Place','Address','Work Unit','Phone','Whether a Party Member',\ 'date of birth']
datai


7.3 Canonical data

# Get the first table row丨
rowi = len(biaoges[0].rows)
rowi
# define an empty list
lis1 = []
# for loop to get the data of the first table
for i in range(1,rowi): # loop from row 2
    lis1.append([biaoges[0].cell(i,0).text,
                 biauges[0].cell(i,1).text,
                 biauges[0].cell(i,2).text,
                 biauges[0].cell(i,3).text,
                 biauges[0].cell(i,4).text])

# create a dataframe
data1 = pd.DataFrame(lis1,columns=['date','category','quantity','price','amount'])
data1

7.4 Batch read

import pandas as pd
import os
os.chdir('C:/Users/yyz/Desktop/python office skills/data/word information/')
lis1=[]
for file in os.listdir('.'):
    if file.endswith('.docx'):
        doc = docx. Document('./' + file)
        biauges = doc.tables
        rowi = len(biaoges[0].rows)
        for i in range(1,rowi):
            lis1.append([biaoges[0].cell(i,0).text,
                     biauges[0].cell(i,1).text,
                     biauges[0].cell(i,2).text,
                     biauges[0].cell(i,3).text,
                     biauges[0].cell(i,4).text])
# create dataframe
data1 = pd.DataFrame(lis1,columns=['date','category','quantity','price','amount'])
data1

8 Use Outlook to send emails in batches

8.1 Import Toolkit

import win32com.client as win32
import pandas as pd

8.2 Reading data

# read data
data1 = pd.read_excel('C:/Users/yyz/Desktop/python send mail in batches.xlsx', sheet_name='send mail')
data1.fillna('',inplace=True)

8.3 Send email

# run outlook
outlook = win32.Dispatch("outlook.Application")
# for loop to send files
for i in range(data1. shape[0]):
    mail = outlook.CreateItem(0) # Create a mail object win32.constants.olMailItem
    mail.To = data1.iloc[i,0] #recipient
    mail.CC = data1.iloc[i,1] #cc person
    mail.Subject = data1.iloc[i,2] #Mail subject
    mail.HTMLBody = data1.iloc[i,3] # Email body in html format
   # mail.Body = data1.iloc[i,3] # mail body
    mail.Attachments.Add(data1.iloc[i,4]) # attachment
    mail.Send() #send
    i + =1
print('send mail %i copies'%i)

There are still many skills in python office automation. Python is easy to master and can help us improve work efficiency. This is one of the reasons why many non-programmers learn python.

Reader benefits: If you also like programming and want to get a higher salary by learning Python, then this set of Python learning materials must be useful to you!

For 0 basic beginners:

If you are a zero-based novice, you can consider getting started with Python quickly
On the one hand, the learning time is relatively short, and the learning content is more comprehensive and concentrated
On the other hand, you can find a learning plan that suits you

Including: Python installation package + activation code, Python web development, Python crawler, Python data analysis, artificial intelligence, machine learning and other tutorials. Take you to learn Python systematically from zero foundation!

Reader Benefits: 2023 Zero-Basic Python Essentials (Video + Source Code + Tools + Software) Safe Links for Free Collection

1. Learning routes in all directions of Python

The route of all directions in Python is to organize the commonly used technical points of Python to form a summary of knowledge points in various fields. Its usefulness lies in that you can find corresponding learning resources according to the above knowledge points to ensure that you learn more comprehensively.

Second, learning software

If a worker wants to do a good job, he must first sharpen his tools. The commonly used development software for learning Python is here, which saves you a lot of time.

3. Introductory learning video

When we watch videos and learn, we can’t just move our eyes and brain without using our hands. A more scientific learning method is to use them after understanding. At this time, the hands-on project is very suitable.

Fourth, actual combat cases

Optical theory is useless, you have to learn to follow along, and you have to do it yourself, so that you can apply what you have learned to practice. At this time, you can learn from some actual combat cases.

Five, 100 Python practice questions

Check the learning results.

6. Interview materials

We must learn Python to find high-paying jobs. The following interview questions are the latest interview materials from first-line Internet companies such as Ali, Tencent, and Byte, and Ali bosses have given authoritative answers. After finishing this set The interview materials believe that everyone can find a satisfactory job.


This full version of the full set of Python learning materials has been uploaded. If you need it, you can scan the CSDN official certification QR code below or click the link to get it for free Guaranteed 100% Free

Reader benefits: 2023 zero-basic Python essential materials (video + source code + tools + software) secure link for free