Table of Contents
- Compare xlwt, openpyxl, xlrd
-
- icon
- Measured
- xlrd & amp;xlwt read and write operations
-
- Install
- read operation
- write operation
- Excel time processing xldate_as_tuple
-
- xlrd. xldate_as_tuple(xldate, datemode)
- xlrd.xldate.xldate_as_datetime(xldate, datemode)
- openpyxl read and write operations
-
- openpyxl
-
- read
- to write
Compare xlwt, openpyxl, xlrd
Icon
To quote someone else’s picture:
- xlwt does not support writing xlsx files.
- openpyxl does not support reading xls files.
- xlrd supports reading xls, xlsx files.
- It is recommended to use xlrd for reading files and openpyxl for writing files.
Tested
A 3.64 MB excel file is prepared.
from functools import wraps from time import time import openpyxl import xlrd def timer(func): @wraps(func) def inner(*args, **kwargs): tmp = time() res = func(*args, **kwargs) print(time() - tmp) return res return inner @timer def openpyxl_test(): fp = openpyxl.open('question bank.xlsx') sheet = fp['xxx'] print(sheet.dimensions) for rows in sheet.iter_rows(min_row=1, max_row=8122, min_col=1, max_col=13): for cell in rows: pass pass @timer def xlrd_test(): fp = xlrd.open_workbook('question bank.xlsx') sheet = fp.sheet_by_name('xxx') print(sheet. nrows, sheet. ncols) for index in range(sheet. nrows): row = sheet.row_values(index) pass if __name__ == '__main__': openpyxl_test() >>> A1:M8122 7.71000599861145 \t xlrd_test() >>> 8122 13 4.619961738586426
Read 8000 rows of data, openpyxl takes 7.71 seconds, and xlrd takes 4.62 seconds.
xlrd & amp;xlwt read and write operations
the difference:
- openyxl: can read and write xlsx and xlsm files, mainly for versions after Excel2007 (.xlsx)
- xlrd: can read xlsx, xls, xlsm files with high efficiency
- xlwt: mainly writes xls files and is efficient, but cannot execute xlsx files
Install
Read: pip3 install xlrd==1.2.0
(higher versions do not support xlsx)
Write: pip install xlwt
Read operation
import xlrd # read file df = xlrd.open_workbook('test.xlsx') # show all sheets print(df. sheet_names()) >>> ['Sheet1', 'Sheet2', 'Sheet3'] # Get all sheet objects print(df. sheets()) [<xlrd.sheet.Sheet object at 0x000001995152D7C8>, <xlrd.sheet.Sheet object at 0x0000019951567648>, <xlrd.sheet.Sheet object at 0x0000019951567988>] # Get the sheet object table=df.sheets()[0] table=df.sheet_by_index(0) table=df.sheet_by_name('name') # Get ranks row=table.nrows col=table.ncols # Get the value of the entire row and column, the row and column values start from 0 table.row_values(num1) table.column_values(num2) # get cell value cell=table.cell(0,0).value
Write operation
import xlwt # create excel file df2 = xlwt. Workbook() # new table table2 = df2.add_sheet('name') table2=df2.add_sheet('name', cell_overwrite_ok=True) allows to write a cell repeatedly # data input table2.write(0,0,'Python') # save df2.save('data2.xls')
excel time processing xldate_as_tuple
When using xlrd to read the date format in the Excel table, what is read is a floating point number, which cannot be used correctly, and the xldate_as_tuple method can solve this problem well.
xlrd.xldate_as_tuple(xldate, datemode)
Parameter xldate: the cell value to be processed
Parameter datemode: time reference (0 represents 1900-01-01 as the reference, 1 represents 1904-01-01 as the reference); often use 1900 as the reference
Return value: returns a tuple, similar to (year,month,day,hour,minute,nearest_second)
However, it only works for excel cells of date type, not for time type cells
xlrd.xldate.xldate_as_datetime(xldate, datemode)
Parameter xldate: the cell value to be processed
Parameter datemode: time reference (0 represents 1900-01-01 as the reference, 1 represents 1904-01-01 as the reference); often use 1900 as the reference
Return value: datetime object, format year-month-day hour:minute:second
openpyxl read and write operations
Prerequisite knowledge:
Test Data:
openpyxl
Installation: pip install openpyxl
Read
from openpyxl import load_workbook # import excel file workbook = load_workbook(filename='test.xlsx') or workbook = open("test.xlsx") # print all table names in this file print(workbook. sheetnames) # switch table sheet = workbook['sheet1'] # print the table cell size print(sheet.dimensions) # Get a cell of the table cell = sheet['A1'] # print the coordinates of the cell print(cell.row, cell.column, cell.coordinate) # Get multiple cells for cell in sheet['A1:A10'] for cell in sheet['A1:G10'] for rows in sheet.iter_rows(min_row=1,max_row=5,min_col=2,max_col=7): for cell in rows: print(cell. value, end='') print() # Get the value of the cell object, [must be a cell object], if you get the tuple, you can't click the value print(cell. value)
Write
# save workbook.save('test.xlsx') # Revise cell = sheet['A1'] cell.value = 'A1' # Append to the last last line sheet.append([15, None, None, 14, 'qq.com']) # add formula sheet['K11'] = '=AVERAGE(K1:K10)' #Add 4 lines to the second line sheet.insert_rows(2,4) # Add 4 columns at the second column sheet.insert_cols(2,4) # delete row sheet.delete_rows(2,4) # delete column sheet. delete_cols(2,4) # Be sure to save it at the end workbook.save('test.xlsx') # create excel from openpyxl import Workbook workbook = Workbook()