Is automating office work with Python too difficult? Learning these topics is enough

Someone asked on Zhihu: What knowledge do you need to learn to use Python for office automation?

This is a question many non-IT professionals face: they want to use Python at work but don’t know where to start. Python is becoming more and more popular for office automation, and batch processing is a blessing for people who work overtime.

Office automation mostly comes down to Excel, PPT, Word, email, file processing, data analysis and processing, crawlers, and so on. This time, let’s take a look at the knowledge points of Python office automation.

  • Python basics

  • excel automation

  • ppt automation

  • word automation

  • Mail processing

  • Batch processing of files

  • Data processing and analysis

  • Automated crawler

Let’s explain them one by one in detail below.

Python basics

The prerequisite for being able to do this is that you can use Python. At the very least, you must be familiar with basic syntax and be able to write small scripts.

To see which parts of the syntax you need, check the topics against a basic Python tutorial, find a free video tutorial to follow, and then practice writing code. If you prefer reading, you can buy an introductory book on Python for reference.


Syntax is the key. You must understand the basic concepts of python programming before learning other tool libraries.

Otherwise it will be very painful.
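As a rough gauge of what "able to write small scripts" means here, the following is a minimal sketch that combines a function, a loop, a dictionary, and a try/except statement. The function names and the sample data are made up for illustration.

```python
def count_words(lines):
    """Count how often each word appears in an iterable of text lines."""
    counts = {}
    for line in lines:
        for word in line.lower().split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def parse_amount(text):
    """Convert text to a float, returning None when it is not a number."""
    try:
        return float(text)
    except ValueError:
        return None


if __name__ == "__main__":
    sample = ["Python is easy", "python is popular"]
    print(count_words(sample))
    print(parse_amount("12.5"), parse_amount("n/a"))
```

If you can read and modify a script like this without help, you are ready for the libraries below.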

excel automation

In fact, every application in the Office family can be automated with VBA, but many people don’t know how to use it.

Python has many third-party libraries for Excel, such as xlwings, xlsxwriter, xlrd, xlwt, pandas, win32com, xlutils, etc.

These libraries make it easy to add, delete, rewrite, and format content in Excel files. It is not recommended to try them all, as that costs too much time. xlwings and pandas are enough to solve most Excel automation problems.

xlwings can not only read and write Excel files, but also adjust formatting and call VBA. It is very powerful and easy to use.

You can also check the specific usage of xlwings (Chinese summary):
https://www.jianshu.com/p/e21894fc5501
https://www.jianshu.com/p/b534e0d465f7
https://www.jianshu.com/p/de7efe591c12

Of course, it is best to read the official website tutorial:

https://www.xlwings.org/

Pandas is a data processing tool that everyone is familiar with. It also supports reading and writing Excel files and has a friendly interface. It is covered further in the data processing section.

If you are interested in automating Excel processing with Python, you can also buy a dedicated book on the subject.
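To make the pandas route concrete, here is a minimal sketch that writes a small table to an .xlsx file, reads it back, and totals it by region. The column names, sheet name, and sample figures are invented for illustration, and the sketch assumes the openpyxl package is installed, since pandas relies on it for .xlsx files.

```python
import os
import tempfile

import pandas as pd


def summarize_sales(df):
    """Return the total amount per region, sorted by region name."""
    return df.groupby("region", as_index=False)["amount"].sum()


def roundtrip_excel(df, path):
    """Write df to an .xlsx file and read it back (needs openpyxl)."""
    df.to_excel(path, index=False, sheet_name="sales")
    return pd.read_excel(path, sheet_name="sales")


if __name__ == "__main__":
    sales = pd.DataFrame(
        {"region": ["East", "West", "East"], "amount": [100, 250, 50]}
    )
    with tempfile.TemporaryDirectory() as tmp:
        loaded = roundtrip_excel(sales, os.path.join(tmp, "sales.xlsx"))
    print(summarize_sales(loaded))
```

xlwings would be the better fit when the workbook has to stay open in Excel or when formatting matters; pandas is usually simpler when only the data is of interest.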

ppt automation

Of course, Python supports automated processing of PPT files. The main libraries are pywin32 (imported as win32com) and python-pptx (imported as pptx), which can create and modify PPT files.

It is recommended to use python-pptx, which is currently the mainstream PPT processing library.

Learning website:
https://python-pptx.readthedocs.io/en/latest/

word automation

Python libraries for working with Word:

  • python-docx, import docx: cross-platform, but it only handles the docx format. It cannot read the older doc format, because doc is not based on XML

  • pypiwin32, import win32com: only works on Windows, and drives an installed copy of Word, so it can handle both doc and docx

  • textract, import textract: handles both “doc” and “docx”, but installation requires some extra dependencies

You can use Python to generate Word files in batches. python-docx is recommended, because it does not require much background knowledge.

Learning website:
https://python-docx.readthedocs.io/en/latest/

Mail processing

Python is also extremely convenient for processing emails. The three libraries of smtplib, imaplib, and email are used together to realize a series of automated operations such as writing, sending, receiving, and reading emails, saving time and effort.
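Here is a minimal sketch of how email and smtplib fit together: the email package builds the message, and smtplib delivers it. The addresses, host, port, and credentials are placeholders to be replaced with the values from your own mail provider, and the sketch assumes the provider accepts SSL connections.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email with the standard email package."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, host, port, user, password):
    """Send msg over an SSL-wrapped SMTP connection."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(user, password)
        server.send_message(msg)


if __name__ == "__main__":
    message = build_message(
        "me@example.com", "you@example.com", "Weekly report", "See attached."
    )
    print(message["Subject"])
    # send_message(message, "smtp.example.com", 465, "me@example.com", "app-password")
```

Receiving and reading mail follows the same pattern with imaplib in place of smtplib.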

I have written a tutorial on sending emails, which I have tested myself:
Practical tips | Free your hands and use Python to automatically send emails

Many other tutorials contain problems and bugs that need constant correction, so it is easiest to start by running the code from that tutorial.

Batch processing of files

File processing includes renaming or creating files in batches, generating documents in batches, changing paths in batches, and other repetitive operations. Doing these by hand, one by one, is really tiring.

Python has a natural advantage in batch operations: modifying thousands of files may take only a few seconds.

os is Python’s standard library module for file and directory operations. It can create, delete, rename, and list files on the computer.

Learning website:
https://www.runoob.com/python3/python3-os-file-methods.html
https://www.liaoxuefeng.com/wiki/1016959663602400/1017606916795776

Python syntax topics to cover (see the Python basics section):

Syntax                          Main content
Basic data types                Immutable data (3): Number, String, Tuple
                                Mutable data (3): List, Dictionary, Set
Operators                       Arithmetic, logical, assignment, comparison, and bitwise operators…
Numeric types                   Integer (int), floating point (float), complex number (complex)
Conditional control statement   if…elif…else statement
Loop statement                  while statement, for statement
Function                        def function definition, function call, parameter passing, anonymous function…
Iteration                       Iteration process, iterator, generator, generator expression
File operation                  open() function; read, readline, readlines, write… methods
os module                       Working with system files and directories
Module                          Module import, commonly used standard modules, commonly used third-party libraries
Errors and exceptions           try/except statement
Object-oriented                 Just master the object-oriented concepts

Commonly used os functions:

Method                          Function
os.chdir(path)                  Change the current working directory
os.getcwd()                     Return the current working directory
os.listdir(path)                Return a list of the names of the files and folders in the folder specified by path
os.makedirs(path[, mode])       Create a folder named path, including any missing parent folders
os.remove(path)                 Delete the file at path
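A batch rename built from the os functions listed above might look like the following sketch. The function name, the prefix, and the sample file names are made up for illustration, and the demo runs in a temporary folder so that no real files are touched.

```python
import os
import tempfile


def add_prefix(folder, prefix, extension=".txt"):
    """Add prefix to the name of every file in folder that ends with extension."""
    renamed = []
    for name in sorted(os.listdir(folder)):
        old_path = os.path.join(folder, name)
        if not os.path.isfile(old_path) or not name.endswith(extension):
            continue
        if name.startswith(prefix):
            continue  # already renamed, so the function is safe to re-run
        new_name = prefix + name
        os.rename(old_path, os.path.join(folder, new_name))
        renamed.append(new_name)
    return renamed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt", "notes.md"):
            open(os.path.join(tmp, name), "w").close()
        print(add_prefix(tmp, "2021_"))
        print(sorted(os.listdir(tmp)))
```

Before running something like this on real folders, it is worth printing the planned old and new names first and renaming only once the list looks right.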

Data processing and analysis

I do data analysis work, and python is basically the main tool, so this is undoubtedly the most valuable part of python office automation.

The main libraries for data processing include: pandas, numpy, matplotlib, sklearn…

pandas is a Python data science library that keeps improving. Its data structures are well suited to data processing, and it includes a large number of analysis functions and methods, along with common descriptive statistics and basic plotting.

If you use Python for data analysis, in my experience almost 90% of the data preprocessing work is done with pandas.

Some companies recruiting analysts already treat pandas as a required tool in their written tests, so if you want to become a data analyst, work hard at learning pandas.
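Typical preprocessing steps with pandas can be sketched like this: remove duplicate rows, fill missing values, and derive a new column. The column names and sample orders are invented for illustration.

```python
import pandas as pd


def clean_orders(df):
    """Drop duplicate rows, fill missing quantities with 0, add a total column."""
    cleaned = df.drop_duplicates().copy()
    cleaned["quantity"] = cleaned["quantity"].fillna(0).astype(int)
    cleaned["total"] = cleaned["quantity"] * cleaned["price"]
    return cleaned.reset_index(drop=True)


if __name__ == "__main__":
    orders = pd.DataFrame(
        {
            "item": ["pen", "pen", "ink", "pad"],
            "quantity": [2, 2, None, 5],
            "price": [1.5, 1.5, 4.0, 2.0],
        }
    )
    print(clean_orders(orders))
```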

Numpy is a numerical calculation library for Python, and many analysis libraries, including pandas, are built on numpy.

Numpy’s core features include:

  • ndarray, a fast and space-efficient multidimensional array with vector arithmetic operations and complex broadcast capabilities

  • Standard mathematical functions for fast operations on entire sets of data (no need to write loops)

  • Tools for reading and writing disk data and tools for manipulating memory mapped files

  • Linear algebra, random number generation, and Fourier transform functions

  • A C API for integrating code written in C, C++, Fortran, etc.

Numpy is particularly important for numerical calculations because it can efficiently handle large arrays of data. This is because:

  • Numpy arrays use less memory than Python’s built-in sequences

  • Numpy can perform complex calculations on entire arrays without the need for Python’s for loops
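The "no for loops" point can be illustrated with a minimal sketch. The arrays and the 10% discount are made-up sample values.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([3, 1, 2])

# Element-wise multiplication over whole arrays, with no explicit Python loop.
totals = prices * quantities

# Broadcasting: the scalar factor is applied to every element.
discounted = totals * 0.9

# The equivalent pure-Python loop, for comparison.
loop_totals = [p * q for p, q in zip(prices.tolist(), quantities.tolist())]

if __name__ == "__main__":
    print(totals)
    print(discounted)
    print(loop_totals)
```

On three elements the two approaches are equally fast; the array version pays off as the data grows.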

matplotlib and seaborn are the main visualization tools in Python. Both are worth learning, because presenting data is as important as analyzing it.

sklearn and keras: sklearn (scikit-learn) is a Python machine learning library that covers most classical machine learning models. Keras is a high-level deep learning library that runs on top of numerical backends such as TensorFlow (and, in older versions, Theano).

These are well-known, first-rate libraries and are highly recommended.

Automated crawler

I believe crawlers are what everyone is most interested in. Python has many libraries for writing crawlers, such as urllib, requests, and scrapy, as well as parsing tools such as XPath (for example through lxml) and beautifulsoup.

Crawlers are easy to get started with but difficult to master, so beginners can try writing simple crawlers for sites such as Douban, Zhihu, and Weibo.
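The two halves of a simple crawler, fetching a page and parsing it, can be sketched with the standard library alone. This example uses urllib for the download and html.parser as a stand-in for beautifulsoup, so it is not how those third-party libraries are called. The class name, function names, and User-Agent string are made up, and the demo parses an inline HTML string rather than a live site.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def fetch(url):
    """Download a page as text; the User-Agent value is only a placeholder."""
    request = Request(url, headers={"User-Agent": "office-script/0.1"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    page = '<p><a href="/a">A</a> <a href="https://example.com/b">B</a></p>'
    print(extract_links(page))
```

For a real page, `extract_links(fetch(url))` chains the two steps.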

Others

Other, less commonly used office automation libraries, such as those for processing PDFs, images, video, and audio, are not covered here.
If you are interested, leave a message at the end of this article: what incredible Python libraries have you used, and what problems have you solved?


syntaxbug.com © 2021 All Rights Reserved.