Rendering, exporting, and previewing Word, Excel, and PPT templates in Java with LibreOffice and jodconverter

Java Office

1. Document format conversion

Converting documents between formats, for example docx to pdf, is a common requirement in office applications.

Java offers many ways to do this, which fall roughly into two groups: internal calls (no additional software to install) and external calls (additional software must be installed).

Internal calls are simpler, but they tend to run into problems such as broken formatting, wrong fonts, and missing content. External calls take more setup, but they avoid these problems to a large extent.

Recommended technical combination: jodconverter + LibreOffice

jodconverter: a Java OpenDocument converter that converts documents between formats. It relies on an installed Apache OpenOffice or LibreOffice.

LibreOffice: a full-featured office suite that uses the Open Document Format (ODF) by default and also supports other formats such as docx, xlsx, and pptx.

jodconverter works with two open source office suites, LibreOffice and Apache OpenOffice. LibreOffice is the recommended choice in terms of stability, conversion quality, and ease of use.

1 LibreOffice installation

LibreOffice official website: https://www.libreoffice.org/

LibreOffice download address: https://www.libreoffice.org/download/download-libreoffice/

LibreOffice 7.5.6: https://www.libreoffice.org/donate/dl/win-x86_64/7.5.6/zh-CN/LibreOffice_7.5.6_Win_x86-64.msi

LibreOffice generally offers two releases, the latest version and the stable version. The stable version is recommended here. Download the installation package that matches your operating system.

During installation, click Next through the wizard and note the installation path, since it is needed later as the office home.
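
If the installation path differs between machines, it can help to probe a few likely locations instead of hard-coding one. The following is a minimal sketch using only the JDK. The candidate directories and the class name OfficeHomeLocator are assumptions for illustration, and the check simply looks for a soffice executable under the program subdirectory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class OfficeHomeLocator {

    // Candidate directories are assumptions; adjust them to your own machines.
    static final List<Path> CANDIDATES = Arrays.asList(
            Paths.get("D:/Program Files/LibreOffice"),
            Paths.get("C:/Program Files/LibreOffice"),
            Paths.get("/usr/lib/libreoffice"),
            Paths.get("/opt/libreoffice"));

    /** Returns the first candidate that looks like an office home, if any. */
    static Optional<Path> firstInstalled(List<Path> candidates) {
        return candidates.stream()
                .filter(OfficeHomeLocator::looksLikeOfficeHome)
                .findFirst();
    }

    /** True if the directory contains program/soffice, soffice.exe, or soffice.bin. */
    static boolean looksLikeOfficeHome(Path dir) {
        Path program = dir.resolve("program");
        return Files.isRegularFile(program.resolve("soffice.exe"))
                || Files.isRegularFile(program.resolve("soffice.bin"))
                || Files.isRegularFile(program.resolve("soffice"));
    }

    public static void main(String[] args) {
        System.out.println(firstInstalled(CANDIDATES)
                .map(Path::toString)
                .orElse("LibreOffice not found in the candidate directories"));
    }
}
```

The path that is found can then be used wherever the office home is configured.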

2 Project maven dependencies

<dependency>
    <groupId>org.jodconverter</groupId>
    <artifactId>jodconverter-local</artifactId>
    <version>4.4.6</version>
</dependency>

3 Code logic and implementation

  1. Create OfficeManager
  2. Create Converter
  3. Create input and output streams
  4. Document format conversion
  5. Close data streams and programs
3.1 Create OfficeManager
LocalOfficeManager.Builder builder = LocalOfficeManager.builder();
// Local Office installation directory; LibreOffice is recommended
builder.officeHome("D:/Program Files/LibreOffice");
// Host name; the office processes run locally
builder.hostName("127.0.0.1");
// Ports to use; one office process is started per port
builder.portNumbers(9000, 9001, 9002);
// Maximum time a single task may run, in milliseconds. Default: 120000 (2 minutes)
builder.taskExecutionTimeout((long) (5 * 1000 * 60));
// Maximum time a task may wait in the queue, in milliseconds. Default: 30000 (30 seconds)
builder.taskQueueTimeout((long) (1000 * 60 * 60));
// Maximum number of tasks a process executes before it is restarted. Default: 200
builder.maxTasksPerProcess(1000);
// Construct
LocalOfficeManager manager = builder.build();
// start up
manager.start();
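
Both timeouts are plain millisecond values, so the arithmetic is easy to get wrong by a factor of ten. A small sketch, using only the JDK, that expresses the same two values through java.time.Duration (the class and constant names are illustrative, not part of jodconverter):

```java
import java.time.Duration;

final class OfficeTimeouts {

    // Same values as 5 * 1000 * 60 and 1000 * 60 * 60 in the builder calls above.
    static final long TASK_EXECUTION_TIMEOUT_MS = Duration.ofMinutes(5).toMillis();
    static final long TASK_QUEUE_TIMEOUT_MS = Duration.ofHours(1).toMillis();

    public static void main(String[] args) {
        System.out.println(TASK_EXECUTION_TIMEOUT_MS); // prints 300000
        System.out.println(TASK_QUEUE_TIMEOUT_MS);     // prints 3600000
    }
}
```

These constants could be passed to the builder in place of the inline multiplications.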
3.2 Create Converter
LocalConverter converter = LocalConverter.builder().officeManager(manager).build();
3.3 Create input and output streams
// Test converting word document to pdf
//Create input stream
FileInputStream input = new FileInputStream("E:/tmp/word/test.docx");
//Create output stream
FileOutputStream output = new FileOutputStream("E:/tmp/word/test.pdf");
3.4 Format conversion
//Convert format
converter.convert(input).as(DefaultDocumentFormatRegistry.DOCX)
        .to(output).as(DefaultDocumentFormatRegistry.PDF).execute();
3.5 Close the stream
//Close the stream
output.close();
input.close();
manager.stop();
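
The explicit close() calls above are skipped if the conversion throws. A common alternative is try-with-resources, sketched below with JDK types only. StreamConversion is a hypothetical stand-in for the converter call and is not a jodconverter type; in the demo a plain byte copy takes the place of the real conversion.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SafeConversion {

    /** Hypothetical stand-in for the converter call; not a jodconverter type. */
    @FunctionalInterface
    interface StreamConversion {
        void run(InputStream in, OutputStream out) throws Exception;
    }

    /** Opens both streams, runs the conversion, and closes the streams even on failure. */
    static void convert(Path source, Path target, StreamConversion conversion) throws Exception {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            conversion.run(in, out);
        }
    }

    public static void main(String[] args) throws Exception {
        Path source = Files.createTempFile("demo", ".txt");
        Path target = Files.createTempFile("demo", ".out");
        Files.write(source, "hello".getBytes(StandardCharsets.UTF_8));
        // A plain byte copy stands in for the real conversion step.
        convert(source, target, (in, out) -> {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        });
        System.out.println(Files.size(target)); // prints 5
    }
}
```

In the same spirit, manager.stop() can be placed in a finally block so that the office processes are shut down even when a conversion fails.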

4 Supported document types

public static final @NonNull DocumentFormat PDF = byExtension("pdf");
public static final @NonNull DocumentFormat SWF = byExtension("swf");
public static final @NonNull DocumentFormat HTML = byExtension("html");
public static final @NonNull DocumentFormat XHTML = byExtension("xhtml");
public static final @NonNull DocumentFormat ODT = byExtension("odt");
public static final @NonNull DocumentFormat OTT = byExtension("ott");
public static final @NonNull DocumentFormat FODT = byExtension("fodt");
public static final @NonNull DocumentFormat SXW = byExtension("sxw");
public static final @NonNull DocumentFormat DOC = byExtension("doc");
public static final @NonNull DocumentFormat DOCX = byExtension("docx");
public static final @NonNull DocumentFormat DOTX = byExtension("dotx");
public static final @NonNull DocumentFormat RTF = byExtension("rtf");
public static final @NonNull DocumentFormat WPD = byExtension("wpd");
public static final @NonNull DocumentFormat TXT = byExtension("txt");
public static final @NonNull DocumentFormat ODS = byExtension("ods");
public static final @NonNull DocumentFormat OTS = byExtension("ots");
public static final @NonNull DocumentFormat FODS = byExtension("fods");
public static final @NonNull DocumentFormat SXC = byExtension("sxc");
public static final @NonNull DocumentFormat XLS = byExtension("xls");
public static final @NonNull DocumentFormat XLSX = byExtension("xlsx");
public static final @NonNull DocumentFormat XLTX = byExtension("xltx");
public static final @NonNull DocumentFormat CSV = byExtension("csv");
public static final @NonNull DocumentFormat TSV = byExtension("tsv");
public static final @NonNull DocumentFormat ODP = byExtension("odp");
public static final @NonNull DocumentFormat OTP = byExtension("otp");
public static final @NonNull DocumentFormat FODP = byExtension("fodp");
public static final @NonNull DocumentFormat SXI = byExtension("sxi");
public static final @NonNull DocumentFormat PPT = byExtension("ppt");
public static final @NonNull DocumentFormat PPTX = byExtension("pptx");
public static final @NonNull DocumentFormat POTX = byExtension("potx");
public static final @NonNull DocumentFormat ODG = byExtension("odg");
public static final @NonNull DocumentFormat OTG = byExtension("otg");
public static final @NonNull DocumentFormat FODG = byExtension("fodg");
public static final @NonNull DocumentFormat SVG = byExtension("svg");
public static final @NonNull DocumentFormat VSD = byExtension("vsd");
public static final @NonNull DocumentFormat VSDX = byExtension("vsdx");
public static final @NonNull DocumentFormat PNG = byExtension("png");
public static final @NonNull DocumentFormat JPEG = byExtension("jpg");
public static final @NonNull DocumentFormat TIFF = byExtension("tif");
public static final @NonNull DocumentFormat GIF = byExtension("gif");
public static final @NonNull DocumentFormat BMP = byExtension("bmp");
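
Since every constant above is looked up by file extension, uploaded files are usually matched to a format through their extension. The sketch below, JDK only, extracts a normalized extension from a file name; the class name FileExtensions is illustrative. The result could then be handed to the registry's extension lookup (assumed here to be DefaultDocumentFormatRegistry.getFormatByExtension; verify this against your jodconverter version).

```java
import java.util.Locale;
import java.util.Optional;

final class FileExtensions {

    /** Returns the lower-cased extension without the dot, or empty if there is none. */
    static Optional<String> extensionOf(String fileName) {
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        int dot = fileName.lastIndexOf('.');
        // No dot in the last path segment, a leading dot only, or a trailing dot.
        if (dot <= slash + 1 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("E:/tmp/word/Test.DOCX")); // prints Optional[docx]
        System.out.println(extensionOf("E:/tmp/word/README"));    // prints Optional.empty
    }
}
```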

5 Complete code

public static void main(String[] args) throws OfficeException, IOException {

    // ====================== Build office manager ======================
    LocalOfficeManager.Builder builder = LocalOfficeManager.builder();
    // Local Office installation directory; LibreOffice is recommended
    builder.officeHome("D:/Program Files/LibreOffice");
    // Host name; the office processes run locally
    builder.hostName("127.0.0.1");
    // Ports to use; one office process is started per port
    builder.portNumbers(9000, 9001, 9002);
    // Maximum time a single task may run, in milliseconds. Default: 120000 (2 minutes)
    builder.taskExecutionTimeout((long) (5 * 1000 * 60));
    // Maximum time a task may wait in the queue, in milliseconds. Default: 30000 (30 seconds)
    builder.taskQueueTimeout((long) (1000 * 60 * 60));
    // Maximum number of tasks a process executes before it is restarted. Default: 200
    builder.maxTasksPerProcess(1000);
    // Construct
    LocalOfficeManager manager = builder.build();
    // start up
    manager.start();
    // ====================== Build document converter ======================
    LocalConverter converter = LocalConverter.builder().officeManager(manager).build();
    // ===================== Implement document conversion =======================
    //Test convert word document to pdf
    //Create input stream
    FileInputStream input = new FileInputStream("E:/tmp/word/test.docx");
    //Create output stream
    FileOutputStream output = new FileOutputStream("E:/tmp/word/test.pdf");
    //Convert format
    converter.convert(input).as(DefaultDocumentFormatRegistry.DOCX)
            .to(output).as(DefaultDocumentFormatRegistry.PDF).execute();
    // close the stream
    output.close();
    input.close();
    manager.stop();
}


2. Spring Boot integration mode

jodconverter provides a Spring Boot integration: jodconverter-spring-boot-starter

1 Project dependencies

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.jodconverter</groupId>
    <artifactId>jodconverter-spring-boot-starter</artifactId>
    <version>4.4.6</version>
</dependency>

2 Configuration file

jodconverter:
  local:
    office-home: D:/Program Files/LibreOffice
    enabled: true
    port-numbers:
      - 8100
      - 8101
      - 8102
      - 8103
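
Each configured port is used by one office process, so a port that is already taken by another program can prevent startup. The following sketch, JDK only, checks whether the configured ports can be bound on the loopback interface before the application starts. The class name PortCheck is illustrative, and the check is a best-effort probe: a port can still be taken between the check and the actual start.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PortCheck {

    /** True if the port can currently be bound on the loopback interface. */
    static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns the subset of ports that cannot be bound right now. */
    static List<Integer> busyPorts(List<Integer> ports) {
        List<Integer> busy = new ArrayList<>();
        for (int port : ports) {
            if (!isFree(port)) {
                busy.add(port);
            }
        }
        return busy;
    }

    public static void main(String[] args) {
        // Same ports as in the configuration file above.
        System.out.println("Busy ports: " + busyPorts(Arrays.asList(8100, 8101, 8102, 8103)));
    }
}
```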

3 Unit test

@SpringBootTest
class SpringBootOfficeApplicationTests {

    @Resource
    private LocalConverter converter;

    @Test
    void contextLoads() throws IOException, OfficeException {
        //Test convert word document to pdf
        //Create input stream
        FileInputStream input = new FileInputStream("E:/tmp/word/test.docx");
        //Create output stream
        FileOutputStream output = new FileOutputStream("E:/tmp/word/test.pdf");
        //Convert format
        converter.convert(input).as(DefaultDocumentFormatRegistry.DOCX)
                .to(output).as(DefaultDocumentFormatRegistry.PDF).execute();
        output.close();
        input.close();
    }

}

3. Document template rendering output

Java-based office projects often need to render data into documents and export the result, for example rendering database records into a table and exporting it as a PDF.

The recommended approach is as follows.

First, design the template, fill it with placeholder tags, and save it in an XML-based format.

Then, use a template engine to render the data into the template.

Finally, use jodconverter to convert the result to pdf.

The recommended template engine is freemarker.

The following example renders a word-processing document and exports it as pdf.

1 Write template file

LibreOffice Writer, the word processor installed with LibreOffice, is recommended for writing templates.

When saving, remember to save the file in fodt format.

FODT is the file extension of a Flat OpenDocument Text file. OpenDocument is an open document standard designed to provide a free and open file format for creating and editing documents. An FODT file stores the whole text document, including text, formatting, images, and other elements, as a single uncompressed XML file, which is why a text template engine can process it directly. The format is commonly used with open source office suites such as LibreOffice and Apache OpenOffice.
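
Because the template is plain XML, any value inserted into it has to be XML-escaped, otherwise a character such as & or < in the data produces a file that LibreOffice cannot open. The sketch below illustrates the idea with JDK classes only. The regex substitution is a simplified stand-in for a template engine, not FreeMarker itself, and the class name and the sample fragment are illustrative.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FlatXmlPlaceholderSketch {

    // Matches placeholders of the form ${name}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Escapes the five characters that are special in XML text and attributes. */
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Replaces each known placeholder with its escaped value; unknown ones are kept. */
    static String fill(String xml, Map<String, ?> values) {
        Matcher m = PLACEHOLDER.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = values.get(m.group(1));
            String replacement = value == null ? m.group() : escapeXml(String.valueOf(value));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String fragment = "<text:p>Name: ${name}, age: ${age}</text:p>";
        // prints <text:p>Name: Zhang &amp; Shan, age: 18</text:p>
        System.out.println(fill(fragment, Map.of("name", "Zhang & Shan", "age", 18)));
    }
}
```

A real template engine offers its own escaping facilities, and those should be preferred over hand-written escaping in production code.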

2 Project Design

2.1 Project dependencies
<dependency>
    <groupId>org.jodconverter</groupId>
    <artifactId>jodconverter-local</artifactId>
    <version>4.4.6</version>
</dependency>

<dependency>
    <groupId>org.freemarker</groupId>
    <artifactId>freemarker</artifactId>
    <version>2.3.32</version>
</dependency>

2.2 Core logic

FreeMarker utility class

import freemarker.template.Configuration;
import freemarker.template.Template;
import freemarker.template.TemplateException;

import java.io.File;
import java.io.IOException;
import java.io.StringWriter;
import java.util.Map;

public class FreemarkerUtils {

    public static final Configuration CONFIGURATION;

    public static final String TEMPLATE_DIRECTORY = "E:/tmp/word";

    static {
        // Initialization
        CONFIGURATION = new Configuration(Configuration.DEFAULT_INCOMPATIBLE_IMPROVEMENTS);
        // Encoding
        CONFIGURATION.setDefaultEncoding("UTF-8");
        // Template folder path
        try {
            // CONFIGURATION.setClassForTemplateLoading(FreemarkerUtils.class, path);
            CONFIGURATION.setDirectoryForTemplateLoading(new File(TEMPLATE_DIRECTORY));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Renders the named template with the given parameters and returns the result as a string
    public static String rendering(String templateName, Map<String, Object> params) throws IOException, TemplateException {
        StringWriter writer = new StringWriter();
        Template template = CONFIGURATION.getTemplate(templateName);
        template.process(params, writer);
        return writer.toString();
    }
}

// ====================== Implement document conversion ======================
// Test: render the fodt template and convert the result to pdf
Map<String, Object> map = new HashMap<>(3);
map.put("name", "Zhang Shan");
map.put("age", 18);
map.put("text", "Cheerful personality, enthusiastic and generous, full of sense of justice, diligent and studious, serious and responsible for work.");
String dom = FreemarkerUtils.rendering("template1.fodt", map);
// Create input stream; encode explicitly as UTF-8 to match the template encoding
ByteArrayInputStream input = new ByteArrayInputStream(dom.getBytes(StandardCharsets.UTF_8));
// Create output stream
FileOutputStream output = new FileOutputStream("E:/tmp/word/template1.pdf");
// Convert format; the rendered template is a fodt document, so declare it as FODT
converter.convert(input).as(DefaultDocumentFormatRegistry.FODT)
        .to(output).as(DefaultDocumentFormatRegistry.PDF).execute();

4. Implement document preview

File format conversion and template rendering both operate on files, whereas document preview leads to image operations.

To preview a document, convert it to pdf and then render the pdf pages as images for viewing.

Recommended library for converting pdf to images: Apache PDFBox

1 Project dependencies

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.27</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox-tools</artifactId>
    <version>2.0.27</version>
</dependency>

2 Specific code

//Create byte output stream
ByteArrayOutputStream output = new ByteArrayOutputStream();
//Convert format
converter.convert(input).as(DefaultDocumentFormatRegistry.DOCX)
        .to(output).as(DefaultDocumentFormatRegistry.PDF).execute();
// Load the pdf from the converted bytes
PDDocument document = PDDocument.load(output.toByteArray());
// Create the page renderer
PDFRenderer pdfRenderer = new PDFRenderer(document);
// Save each page of the document as an image
for (int i = 0; i < document.getNumberOfPages(); i++) {
    BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(i, 600);
    // Backslashes must be escaped in Java string literals
    ImageIO.write(bufferedImage, "PNG", new File("E:\\tmp\\word" + i + ".png"));
}
// Release the document
document.close();
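
The DPI value passed to the renderer determines the size of each page image. PDF page sizes are measured in points, 72 per inch, so rendering at a given DPI scales the page by dpi / 72. The sketch below, JDK only, estimates the pixel dimensions and the in-memory size of one A4 page at several DPI values. The 4 bytes per pixel figure and the rounding are assumptions for a rough estimate, and the class name RenderSize is illustrative.

```java
final class RenderSize {

    static final double POINTS_PER_INCH = 72.0;

    /** Converts a length in points to pixels at the given DPI (rounded). */
    static int pixels(double points, int dpi) {
        return (int) Math.round(points * dpi / POINTS_PER_INCH);
    }

    /** Rough size of the uncompressed image, assuming 4 bytes per pixel. */
    static long approxBytes(double widthPt, double heightPt, int dpi) {
        return 4L * pixels(widthPt, dpi) * pixels(heightPt, dpi);
    }

    public static void main(String[] args) {
        double widthPt = 595.28, heightPt = 841.89; // A4 in points
        for (int dpi : new int[] {96, 150, 600}) {
            System.out.printf("%d dpi: %d x %d px, about %d MB in memory%n",
                    dpi, pixels(widthPt, dpi), pixels(heightPt, dpi),
                    approxBytes(widthPt, heightPt, dpi) / 1_000_000);
        }
    }
}
```

At 600 DPI an A4 page comes to roughly 35 million pixels, so a lower value is often sufficient for an on-screen preview and uses far less memory.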