python operation excel (openpyxl, xlrd&xlwt)

Table of Contents

  • Compare xlwt, openpyxl, xlrd
    • icon
    • Measured
  • xlrd & amp;xlwt read and write operations
    • Install
    • read operation
    • write operation
    • Excel time processing xldate_as_tuple
      • xlrd. xldate_as_tuple(xldate, datemode)
      • xlrd.xldate.xldate_as_datetime(xldate, datemode)
  • openpyxl read and write operations
    • openpyxl
      • read
      • to write

Compare xlwt, openpyxl, xlrd

Icon

To quote someone else’s picture:

  1. xlwt does not support writing xlsx files.
  2. openpyxl does not support reading xls files.
  3. xlrd supports reading xls, xlsx files.
  4. It is recommended to use xlrd for reading files and openpyxl for writing files.

Tested

A 3.64 MB excel file is prepared.

from functools import wraps
from time import time
import openpyxl
import xlrd


def timer(func):
    @wraps(func)
    def inner(*args, **kwargs):
        tmp = time()
        res = func(*args, **kwargs)
        print(time() - tmp)
        return res

    return inner


@timer
def openpyxl_test():
    fp = openpyxl.open('question bank.xlsx')
    sheet = fp['xxx']
    print(sheet.dimensions)
    for rows in sheet.iter_rows(min_row=1, max_row=8122, min_col=1, max_col=13):
        for cell in rows:
            pass
        pass


@timer
def xlrd_test():
    fp = xlrd.open_workbook('question bank.xlsx')
    sheet = fp.sheet_by_name('xxx')
    print(sheet. nrows, sheet. ncols)
    for index in range(sheet. nrows):
        row = sheet.row_values(index)
        pass


if __name__ == '__main__':
    openpyxl_test()
    >>>
    A1:M8122
7.71000599861145
\t
    xlrd_test()
    >>>
    8122 13
4.619961738586426

Read 8000 rows of data, openpyxl takes 7.71 seconds, and xlrd takes 4.62 seconds.

xlrd & amp;xlwt read and write operations

the difference:

  1. openyxl: can read and write xlsx and xlsm files, mainly for versions after Excel2007 (.xlsx)
  2. xlrd: can read xlsx, xls, xlsm files with high efficiency
  3. xlwt: mainly writes xls files and is efficient, but cannot execute xlsx files

Install

Read: pip3 install xlrd==1.2.0 (higher versions do not support xlsx)
Write: pip install xlwt

Read operation

import xlrd

# read file
df = xlrd.open_workbook('test.xlsx')

# show all sheets
print(df. sheet_names())
>>> ['Sheet1', 'Sheet2', 'Sheet3']

# Get all sheet objects
print(df. sheets())
[<xlrd.sheet.Sheet object at 0x000001995152D7C8>, <xlrd.sheet.Sheet object at 0x0000019951567648>, <xlrd.sheet.Sheet object at 0x0000019951567988>]

# Get the sheet object
table=df.sheets()[0]
table=df.sheet_by_index(0)
table=df.sheet_by_name('name')

# Get ranks
row=table.nrows
col=table.ncols

# Get the value of the entire row and column, the row and column values start from 0
table.row_values(num1)
table.column_values(num2)

# get cell value
cell=table.cell(0,0).value

Write operation

import xlwt

# create excel file
df2 = xlwt. Workbook()

# new table
table2 = df2.add_sheet('name')
table2=df2.add_sheet('name', cell_overwrite_ok=True) allows to write a cell repeatedly

# data input
table2.write(0,0,'Python')

# save
df2.save('data2.xls')

excel time processing xldate_as_tuple

When using xlrd to read the date format in the Excel table, what is read is a floating point number, which cannot be used correctly, and the xldate_as_tuple method can solve this problem well.

xlrd.xldate_as_tuple(xldate, datemode)

Parameter xldate: the cell value to be processed

Parameter datemode: time reference (0 represents 1900-01-01 as the reference, 1 represents 1904-01-01 as the reference); often use 1900 as the reference

Return value: returns a tuple, similar to (year,month,day,hour,minute,nearest_second)

However, it only works for excel cells of date type, not for time type cells

xlrd.xldate.xldate_as_datetime(xldate, datemode)

Parameter xldate: the cell value to be processed

Parameter datemode: time reference (0 represents 1900-01-01 as the reference, 1 represents 1904-01-01 as the reference); often use 1900 as the reference

Return value: datetime object, format year-month-day hour:minute:second

openpyxl read and write operations

Prerequisite knowledge:

Test Data:

openpyxl

Installation: pip install openpyxl

Read

from openpyxl import load_workbook

# import excel file
workbook = load_workbook(filename='test.xlsx')
or
workbook = open("test.xlsx")

# print all table names in this file
print(workbook. sheetnames)

# switch table
sheet = workbook['sheet1']

# print the table cell size
print(sheet.dimensions)

# Get a cell of the table
cell = sheet['A1']

# print the coordinates of the cell
print(cell.row, cell.column, cell.coordinate)

# Get multiple cells
for cell in sheet['A1:A10']
for cell in sheet['A1:G10']

for rows in sheet.iter_rows(min_row=1,max_row=5,min_col=2,max_col=7):
for cell in rows:
print(cell. value, end='')
print()

# Get the value of the cell object, [must be a cell object], if you get the tuple, you can't click the value
print(cell. value)

Write

# save
workbook.save('test.xlsx')

# Revise
cell = sheet['A1']
cell.value = 'A1'

# Append to the last last line
sheet.append([15, None, None, 14, 'qq.com'])

# add formula
sheet['K11'] = '=AVERAGE(K1:K10)'

#Add 4 lines to the second line
sheet.insert_rows(2,4)
# Add 4 columns at the second column
sheet.insert_cols(2,4)

# delete row
sheet.delete_rows(2,4)
# delete column
sheet. delete_cols(2,4)

# Be sure to save it at the end
workbook.save('test.xlsx')

# create excel
from openpyxl import Workbook
workbook = Workbook()