JAVA parsing EXCEL (JExcelAPI, POI, EasyExcel)

Foreword

Article directory
    • Preface
    • JExcelAPI
      • Demo
    • POI
      • HSSFWorkBook
      • XSSFWorkBook
        • Demo
      • SXSSFWorkBook
        • Demo
      • XSSFReader
        • Demo
    • EasyExcel
      • Demo

Demo code: https://github.com/RwTo/excel-demo
There are generally three ways to parse Excel in JAVA

JExcelAPI
POI
EasyExcel

JExcelAPI

Official website: https://jexcelapi.sourceforge.net/

  1. Only supports the 2003 version of Excel, that is, files with the suffix xls
  2. Adopts a stream processing model to read and write line by line – therefore, large amounts of data can be processed, and OOM generally does not occur.
Demo
<!--JExcelAPI-->
<dependency>
<groupId>net.sourceforge.jexcelapi</groupId>
<artifactId>jxl</artifactId>
<version>2.6.12</version>
</dependency>

Read xls file

public class ReadExcelDemo {
    public static void main(String[] args) {
        try {
            // 1. Open the Excel file
            String filePath = ExcelConstant.EXCEL_PATH_XLS;
            Workbook workbook = Workbook.getWorkbook(new java.io.File(filePath));

            // 2. Get the first worksheet
            Sheet sheet = workbook.getSheet(0);
            /**
             * -Xms64m -Xmx64m
             * row = 4000 col = 50
             * Normal reading
             */
            // 3. Traverse each row and read the data
            for (int row = 0; row < sheet.getRows(); row + + ) {
                for (int col = 0; col < sheet.getColumns(); col + + ) {
                    Cell cell = sheet.getCell(col, row);
                    String content = cell.getContents();
                    content + = cell.getCellFormat().getBackgroundColour().getDescription();
                    System.out.print(content + "\t");
                }
                System.out.println();
            }

            // 4. Close the workbook
            workbook.close();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

write xls file

public class WriteExcelDemo {
    public static void main(String[] args) {
        try {
            String filePath = ExcelConstant.EXCEL_PATH_XLS;
            // 1. Create workbook
            WritableWorkbook workbook = Workbook.createWorkbook(new File(filePath));

            // 2. Create worksheet
            WritableSheet sheet = workbook.createSheet("Sheet1", 0);

            // 3. Define cell color
            WritableCellFormat greenFormat = new WritableCellFormat();
            greenFormat.setBackground(jxl.format.Colour.GREEN);

            WritableCellFormat yellowFormat = new WritableCellFormat();
            yellowFormat.setBackground(jxl.format.Colour.YELLOW);
            /**
             * -Xms64m -Xmx64m
             * row = 4000 col = 50
             * Normal writing
             */
            // 4. Write data
            for (int row = 0; row < 4000; row + + ) {
                for (int col = 0; col < 50; col + + ) {
                    if(col%2 == 0){
                        Label label = new Label(col, row, "Cell " + (row + 1) + "-" + (col + 1),yellowFormat);
                        sheet.addCell(label);
                    }else{
                        Label label = new Label(col, row, "Cell " + (row + 1) + "-" + (col + 1),greenFormat);
                        sheet.addCell(label);
                    }
                }
            }

            // 5. Save the workbook
            workbook.write();
            workbook.close();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

POI

Official website: https://poi.apache.org/
It has rich functions, is widely used, and has many problems (such as OOM).
Four APIs to operate Excel
Official website introduction: https://poi.apache.org/components/spreadsheet/how-to.html

Document Object Model
HSSFWorkBook (rich in features, but OOM)
XSSFWorkBook (feature-rich, but easy to OOM)
Event processing model (streaming processing)
SXSSFWorkBook (only supports xlsx writing)
XSSFReader (only provides templates, users need to write them themselves)

POM file

<!-- Apache POI -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>4.1.2</version>
</dependency>
HSSFWorkBook
  1. Support parsing and writing of xls files
  2. Based on DOM, load the entire Excel file into memory, build a complete Excel document object model tree, and then parse and operate. When the file is large, memory overflow may occur.
    However, because the xls file supports a small number of styles and cells, OOM generally does not occur.
XSSFWorkBook
  1. Supports parsing and writing of xlsx files
  2. Based on DOM, load the entire Excel file into memory, build a complete Excel document object model tree, and then parse and operate. When the file is large, memory overflow may occur.
Demo

Write to excel(xls,xlsx)

public class WriteExcelDemo {
    public static void main(String[] args) {
        try {

            String filePath = ExcelConstant.EXCEL_PATH_XLS;
            //String filePath = ExcelConstant.EXCEL_PATH_XLSX;
            //Workbook workbook = new XSSFWorkbook(); //Supports writing of xlsx files
            Workbook workbook = new HSSFWorkbook(); //Supports writing of xls files
            Sheet sheet = workbook.createSheet("Sheet1");
            /**
             * -Xms64m -Xmx64m
             * row = 4000 col = 50
             *OOM occurs
             */
            for (int row = 0; row < 4000; row + + ) {
                Row excelRow = sheet.createRow(row);
                for (int col = 0; col < 50; col + + ) {
                    Cell cell = excelRow.createCell(col);
                    cell.setCellValue("Cell " + (row + 1) + "-" + (col + 1));
                }
            }

            FileOutputStream fileOutputStream = new FileOutputStream(filePath);
            workbook.write(fileOutputStream);

            fileOutputStream.close();
            workbook.close();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Read excel(xls,xlsx)

public class ReadExcelDemo {
    public static void main(String[] args) {
        try {
            String filePath = ExcelConstant.EXCEL_PATH_XLSX;
            //String filePath = ExcelConstant.EXCEL_PATH_XLSX;
            FileInputStream fileInputStream = new FileInputStream(filePath);
            //WorkbookFactory will automatically choose to use HSSFWorkBook or XSSFWorkBook according to the file type
            Workbook workbook = WorkbookFactory.create(fileInputStream);

            Sheet sheet = workbook.getSheetAt(0);
            /**
             * -Xms64m -Xmx64m
             * row = 4000 col = 50
             *OOM occurs
             */
            for (Row row : sheet) {
                for (Cell cell : row) {
                    String cellValue = getCellValueAsString(cell);
                    System.out.print(cellValue + "\t");
                }
                System.out.println();
            }

            fileInputStream.close();
            workbook.close();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static String getCellValueAsString(Cell cell) {
        if (cell == null) {
            return "";
        }

        switch (cell.getCellType()) {
            case STRING:
                return cell.getStringCellValue();
            case NUMERIC:
                return String.valueOf(cell.getNumericCellValue());
            case BOOLEAN:
                return String.valueOf(cell.getBooleanCellValue());
            default:
                return "";
        }
    }
}
SXSSFWorkBook
  1. Only supports xlsx writing
  2. Based on streaming processing, Excel is processed line by line, and disk space is exchanged for memory. Data can be efficiently written into Excel files while reducing memory usage.
  3. SXSSFWorkbook adopts a memory-optimized design to avoid the need to load all data into memory. It only buffers a limited number of lines in memory and discards the buffered data after writing to the output stream, thus reducing memory usage.
  4. SXSSFWorkbook supports setting the number of rows to keep in memory before writing data. By controlling the paging size, you can flexibly control memory usage, which is especially suitable for processing very large Excel files.
Demo

writexlsx

public class SXSSFWriteExcelDemo {
    public static void main(String[] args) {
        try {
            // Set custom temporary directory
            String customTempDirPath = "temp";
            System.setProperty("java.io.tmpdir", customTempDirPath);

            String filePath = ExcelConstant.EXCEL_PATH_XLSX_BIG;
            SXSSFWorkbook workbook = new SXSSFWorkbook(500);//rowAccessWindowSize is the number of records cached in memory, the default is 100
            Sheet sheet = workbook.createSheet("Sheet1");
            for (int row = 0; row < 100000; row + + ) {
                Row excelRow = sheet.createRow(row);
                for (int col = 0; col < 3; col + + ) {
                    Cell cell = excelRow.createCell(col);
                    cell.setCellValue("Cell " + (row + 1) + "-" + (col + 1));
                }
            }

            FileOutputStream fileOutputStream = new FileOutputStream(filePath);
            workbook.write(fileOutputStream);

            fileOutputStream.close();
            //Use the dispose() method to release (delete) the temporary resources used by SXSSFWorkbook. This step is important especially after writing a large amount of data.
            workbook.dispose();
            /*If you do not release it manually, the temporary files will be deleted by default when the virtual machine stops.
            poi.keep.tmp.files Through this configuration, you can control that temporary files are not deleted when the virtual machine is stopped*/

            //After the program is executed, manually delete the temporary file directory (see if necessary)
            deleteTempFiles(new File(customTempDirPath));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    private static void deleteTempFiles(File directory) {
        if (directory.isDirectory()) {
            File[] files = directory.listFiles();
            if (files != null) {
                for (File file : files) {
                    deleteTempFiles(file);
                }
            }
        }
        if (!directory.delete()) {
            System.err.println("Failed to delete temp file: " + directory.getAbsolutePath());
        } else {
            System.out.println("Deleted temp file: " + directory.getAbsolutePath());
        }
    }

}
XSSFReader

Official website case: https://poi.apache.org/components/spreadsheet/how-to.html#xssf_sax_api

  1. Only supports xlsx parsing
  2. Based on streaming processing, it reads cells line by line without saving the entire excel to memory, which is suitable for processing large excels.
Demo

Read xlsx

public class SAXWriteExcelDemo {

    public static void main(String[] args) throws Exception {
        String filePath = ExcelConstant.EXCEL_PATH_XLSX_BIG;
        InputStream is = new FileInputStream(filePath);
        OPCPackage opcPackage = OPCPackage.open(is);
        try {
            //Read Excel file
            XSSFReader reader = new XSSFReader(opcPackage);
            //Use event handler to process Sheet data
            SAXSheetHandler sheetHandler = new SAXSheetHandler();
            XSSFSheetXMLHandler xmlHandler = new XSSFSheetXMLHandler(reader.getStylesTable(), reader.getSharedStringsTable(), sheetHandler, false);
            XMLReader sheetParser = XMLReaderFactory.createXMLReader();
            sheetParser.setContentHandler(xmlHandler);
            Iterator<InputStream> sheetsData = reader.getSheetsData();
            while(sheetsData.hasNext()){
                InputStream sheetIs = sheetsData.next();
                InputSource sheet = new InputSource(sheetIs);
                //Start parsing the sheet page
                System.out.println("======================================Start processing sheet page== ============================================");
                long start = System.currentTimeMillis();
                sheetParser.parse(sheet);
                System.out.println("======================================End of sheet processing=== ===========================================");
                long end = System.currentTimeMillis();
                List<List<SAXCell>> curSheet = sheetHandler.getCurSheet();
                curSheet.forEach(System.out::println);

                System.out.println("Number of rows processed: " + sheetHandler.getRowCount());
                System.out.println("Number of cells processed: " + sheetHandler.getCellCount());
                System.out.println("Processing time (ms): " + (end-start));

                sheetHandler.clear();
            }
        } catch (IOException e) {
            e.printStackTrace();
        } catch (OpenXML4JException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        }finally {
            // Close the input stream and OPCPackage
            is.close();
            opcPackage.close();
        }
    }
}

class SAXSheetHandler implements XSSFSheetXMLHandler.SheetContentsHandler {
    //current page
    private List<List<SAXCell>> curSheet;
    //Current row
    private List<SAXCell> curRow;
    //Total number of cells
    private int cellCount;

    public SAXSheetHandler() {
        this.curSheet = new LinkedList<>();
    }

    /**
     * When a row processing starts
     *
     * @param i line number
     */
    @Override
    public void startRow(int i) {
        //Start processing a new row and initialize the current row
        curRow = new LinkedList<>();
    }

    /**
     * At the end of a line of processing
     *
     * @param i line number
     */
    @Override
    public void endRow(int i) {
        //The processing of one row is completed and the data of this row is stored in the sheet
        curSheet.add(curRow);
    }


    /**
     * Process cells (empty cells will not be read)
     *
     * @param cellReference cell name
     * @param formattedValue cell value
     * @param xssfComment annotation
     */
    @Override
    public void cell(String cellReference, String formattedValue, XSSFComment xssfComment) {
        SAXCell cell = new SAXCell(cellReference, formattedValue);
        curRow.add(cell);
        cellCount + + ;
    }

    @Override
    public void headerFooter(String text, boolean isHeader, String tagName) {
        //Header and footer
    }

    @Override
    public void endSheet() {
        //sheet processing ends
    }

    public List<List<SAXCell>> getCurSheet() {
        return curSheet;
    }

    public int getCellCount() {
        return cellCount;
    }

    public int getRowCount() {
        return curSheet.size();
    }

    public void clear(){
        curSheet = new LinkedList<>();
        cellCount = 0;
    }
}
@Data
class SAXCell {
    private String cellName;
    private String value;

    public SAXCell(String cellName, String value) {
        this.cellName = cellName;
        this.value = value;
    }
}

EasyExcel

Official website: https://easyexcel.opensource.alibaba.com/

  1. Supports parsing and writing of xlsx and xls files
  2. Streaming processing saves memory and can handle large excel
  3. Supports annotations, simple and easy to use
Demo

POM

<!--easyExcel-->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>easyexcel</artifactId>
<version>2.2.0-beta2</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>

Example

@Data
public class UserData {
    private String name;
    private int age;
    private String email;

    public UserData() {
    }

    public UserData(String name, int age, String email) {
        this.name = name;
        this.age = age;
        this.email = email;
    }
}

Write to excel(xls,xlsx)

public class EasyWriteExcelDemo {

    public static void main(String[] args) {
        String filePath = ExcelConstant.EE_EXCEL_PATH_XLSX_BIG;

        //Create a list of written data
        List<UserData> dataList = new ArrayList<>();
        dataList.add(new UserData("Alice", 25, "[email protected]"));
        dataList.add(new UserData("Bob", 30, "[email protected]"));
        dataList.add(new UserData("Charlie", 28, "[email protected]"));

        // Use EasyExcel to write Excel files
        EasyExcel.write(filePath, UserData.class).sheet("Sheet1").doWrite(dataList);
    }
}

Read excel(xls,xlsx)

public class EasyReadExcelDemo {

    public static void main(String[] args) {
        String filePath = ExcelConstant.EE_EXCEL_PATH_XLSX_BIG;

        // Use EasyExcel to read Excel files
        EasyExcel.read(filePath, UserData.class, new UserDataListener()).sheet().doRead();
    }

    public static class UserDataListener extends AnalysisEventListener<UserData> {
        private List<UserData> dataList = new ArrayList<>();

        @Override
        public void invoke(UserData data, AnalysisContext context) {
            dataList.add(data);
        }

        @Override
        public void doAfterAllAnalysed(AnalysisContext context) {
            // Here you can process the data in dataList, such as saving to the database or other operations
            for (UserData userData : dataList) {
                System.out.println(userData.getName() + "\t" + userData.getAge() + "\t" + userData.getEmail());
            }
        }
    }
}

The knowledge points of the article match the official knowledge files, and you can further learn related knowledge. Java Skill TreeHomepageOverview 139023 people are learning the system