Generating PDFs dynamically in Java with PDFBox

Apache PDFBox is a Java library for working with PDF documents. It provides many functions and methods to read, create, manipulate and extract the content of PDF documents.

Add the Maven dependency

<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.24</version>
</dependency>

PDFBox PDF generation example

try {
    // Create a blank PDF document
    PDDocument document = new PDDocument();
    // Create a page and add it to the document
    PDPage page = new PDPage(PDRectangle.A4);
    document.addPage(page);
    // Create a content stream for the page
    PDPageContentStream contentStream = new PDPageContentStream(document, page);
    // Set the font and font size
    contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
    // Draw text on the page
    contentStream.beginText();
    contentStream.newLineAtOffset(100, 700);
    contentStream.showText("Hello, World!");
    contentStream.endText();
    // Close the content stream
    contentStream.close();
    // Save the PDF document
    document.save("output.pdf");
    // Close the PDF document
    document.close();
    System.out.println("PDF generated successfully!");
} catch (IOException e) {
    e.printStackTrace();
}

Common methods

PDDocument class

The source code describes the PDDocument class as follows:

"This is the in-memory representation of the PDF document."

In a Java program you can simply think of a PDDocument as the PDF document itself: every subsequent operation on it is an operation on the PDF document.

Create a brand new PDF document, which contains no pages yet:

PDDocument document = new PDDocument();

If you want to fill an existing PDF template with dynamic data, use the PDDocument.load() method to load the prepared template:

PDDocument document = PDDocument.load(new ClassPathResource("/static/reportTemplate.pdf").getInputStream());

You can also load the PDF template as a File, but loading from an input stream is recommended, because ClassPathResource.getFile() fails when the resource is packaged inside a JAR:

PDDocument document = PDDocument.load(new ClassPathResource("/static/reportTemplate.pdf").getFile());

If the PDF template is password-protected, use the PDDocument.load(InputStream input, String password) method to open it. The following call passes 123456 as the decryption password:

PDDocument document = PDDocument.load(new ClassPathResource("/static/reportTemplate.pdf").getInputStream(),"123456");
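Loading with a password only opens a document that is already password-protected. To protect a newly generated document, PDFBox 2.0.x offers StandardProtectionPolicy together with PDDocument.protect(). The following is a minimal sketch under stated assumptions: the class and method names (PdfEncryptionSketch, createEncryptedPdf, opensWithPassword) and the passwords are illustrative and not part of PDFBox, and the 128-bit key length is just one possible choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.encryption.AccessPermission;
import org.apache.pdfbox.pdmodel.encryption.StandardProtectionPolicy;

// Hypothetical helper class, for illustration only
final class PdfEncryptionSketch {

    /** Builds a one-page document, encrypts it, and returns the saved bytes. */
    static byte[] createEncryptedPdf(String ownerPassword, String userPassword) throws IOException {
        try (PDDocument document = new PDDocument();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            document.addPage(new PDPage());

            // Permissions granted to someone who opens the file with the user password
            AccessPermission permissions = new AccessPermission();
            permissions.setCanModify(false);

            StandardProtectionPolicy policy =
                    new StandardProtectionPolicy(ownerPassword, userPassword, permissions);
            policy.setEncryptionKeyLength(128); // assumed key length

            // The encryption itself is applied when the document is saved
            document.protect(policy);
            document.save(out);
            return out.toByteArray();
        }
    }

    /** Opens the encrypted bytes with the given password and reports whether the file is encrypted. */
    static boolean opensWithPassword(byte[] pdf, String password) throws IOException {
        try (PDDocument document = PDDocument.load(new ByteArrayInputStream(pdf), password)) {
            return document.isEncrypted();
        }
    }
}
```

The user password is the one a reader types to open the file, while the owner password controls the permissions.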

PDDocument.load() has many other overloads that are not listed here; if you are interested, see the PDFBox source code. Once the document has been edited, it can be saved either to an output stream or to a file:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
document.save(baos); // Save the document to an in-memory output stream

document.save("output.pdf"); // Save the document to a file

After saving to a stream, we sometimes need to send the file to the front end for download.

// Convert the PDF document to a byte array
byte[] pdfBytes = baos.toByteArray();

// Wrap the bytes in an InputStreamResource
ByteArrayInputStream bis = new ByteArrayInputStream(pdfBytes);
InputStreamResource resource = new InputStreamResource(bis);

// Set the HTTP response headers
HttpHeaders headers = new HttpHeaders();
headers.add(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=output.pdf");
headers.add(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_PDF_VALUE);

// Return the response entity with the PDF content
return ResponseEntity.ok()
        .headers(headers)
        .body(resource);

After completing the document operation, be sure to execute the document.close() method to close the pdf document.

document.close();
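Because PDDocument and PDPageContentStream both implement Closeable, the close() calls can also be left to try-with-resources, which closes them even when an exception is thrown. A minimal sketch, assuming a hypothetical helper class ClosingSketch with a renderToBytes method (neither is part of PDFBox):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

// Hypothetical helper class, for illustration only
final class ClosingSketch {

    /** Renders a single line of text into a one-page PDF and returns the file bytes. */
    static byte[] renderToBytes(String text) throws IOException {
        try (PDDocument document = new PDDocument();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            PDPage page = new PDPage(PDRectangle.A4);
            document.addPage(page);

            // The content stream must be closed before the document is saved
            try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
                contentStream.setFont(PDType1Font.HELVETICA, 12);
                contentStream.beginText();
                contentStream.newLineAtOffset(100, 700);
                contentStream.showText(text);
                contentStream.endText();
            }

            document.save(out);
            return out.toByteArray();
        } // the document is closed here, whether or not saving succeeded
    }
}
```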

PDPage class

A PDPage represents a single page of the PDF document. To get the total number of pages:

int pageNumber = document.getNumberOfPages();

To get a specific page:

PDPage page = document.getPage(0);

If you are working with a PDF template, use the document.getPage(index) method to obtain a specific page of the document and operate on it (the index starts from 0). You can also create a brand new page with new PDPage():

PDPage newPage = new PDPage(PDRectangle.A4);

A page created with new PDPage() does not belong to any document yet, so it must be added to the PDF document:

document.addPage(newPage);

However, addPage() appends the page to the end of the PDF document. To insert the page at a specific position, use PDPageTree instead:

PDPage page = document.getPage(1); // Get the second page
PDPage newPage = new PDPage(PDRectangle.A4);
PDPageTree pages = document.getPages();
pages.insertAfter(newPage, page); // Insert after page 2
// Or, to insert before page 2 instead (use only one of the two calls):
// pages.insertBefore(newPage, page);

Get the width and height of the page; these are needed later when positioning text. The origin of the page coordinate system is the lower-left corner, so to place an element 10 units from the left edge and 10 units from the top edge, use the coordinates (10, pageHeight - 10):

float pageWidth = page.getMediaBox().getWidth();
float pageHeight = page.getMediaBox().getHeight();
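Since most layouts are described from the top of the page, it can help to wrap this conversion in small helpers. The sketch below is plain arithmetic with no PDFBox calls; the class PageCoordinates and its method names are assumptions made for illustration.

```java
// Hypothetical helper class, for illustration only
final class PageCoordinates {

    /** Converts a distance from the top edge into a PDF y coordinate (origin at the bottom left). */
    static float yFromTop(float pageHeight, float topMargin) {
        return pageHeight - topMargin;
    }

    /**
     * Returns the lower-left y coordinate of a box of the given height
     * whose top edge sits topMargin below the top of the page.
     */
    static float boxYFromTop(float pageHeight, float topMargin, float boxHeight) {
        return pageHeight - topMargin - boxHeight;
    }
}
```

For text, yFromTop gives the baseline position; for shapes and images, which are positioned by their lower-left corner, boxYFromTop also subtracts the element's own height.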

PDPageContentStream

The PDPageContentStream class writes the content stream of a page. It is bound to a PDF document and to a specific page, which amounts to creating the content stream for that page.

PDPageContentStream contentStream = new PDPageContentStream(document, page);

If PDPageContentStream.AppendMode is not specified, the stream is created in overwrite mode by default, and anything written to it replaces the page's existing content streams. To keep the existing content, specify append mode (the last argument enables compression):

PDPageContentStream contentStream = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, true);

Mode constant                             Mode            Note
PDPageContentStream.AppendMode.OVERWRITE  overwrite mode  Overwrites the existing page content streams
PDPageContentStream.AppendMode.APPEND     append mode     Appends the content stream after all existing page content streams
PDPageContentStream.AppendMode.PREPEND    prepend mode    Inserts the content stream before all existing page content streams

Once all operations on the contentStream are finished, the content stream must be closed.

contentStream.close();
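When appending to a template page, the existing content may have left transformations or colors in the graphics state. PDFBox 2.0.x has a five-argument constructor whose last flag, resetContext, asks for the graphics state to be reset first. The following is a minimal sketch; the class AppendModeSketch and the method stampText are hypothetical names used only for this example.

```java
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

// Hypothetical helper class, for illustration only
final class AppendModeSketch {

    /** Adds one line of text on top of whatever the page already contains. */
    static void stampText(PDDocument document, PDPage page, String text, float x, float y)
            throws IOException {
        // Arguments: append mode, compress = true, resetContext = true
        try (PDPageContentStream contentStream =
                     new PDPageContentStream(document, page, AppendMode.APPEND, true, true)) {
            contentStream.setFont(PDType1Font.HELVETICA, 12);
            contentStream.beginText();
            contentStream.newLineAtOffset(x, y);
            contentStream.showText(text);
            contentStream.endText();
        } // try-with-resources closes the content stream
    }
}
```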

Writing content to a PDF

About fonts

In Apache PDFBox, font-related classes are mainly located under the org.apache.pdfbox.pdmodel.font package. Here are some commonly used font classes:

  1. PDType1Font: This class represents a Type 1 font, an outline-based font format. Type 1 fonts such as Helvetica, Times Roman, and Courier are commonly used in PDF documents.

Example:

PDType1Font font = PDType1Font.HELVETICA_BOLD;

The 14 standard fonts are defined as constants in the PDType1Font source code:

public static final PDType1Font TIMES_ROMAN = new PDType1Font("Times-Roman");
public static final PDType1Font TIMES_BOLD = new PDType1Font("Times-Bold");
public static final PDType1Font TIMES_ITALIC = new PDType1Font("Times-Italic");
public static final PDType1Font TIMES_BOLD_ITALIC = new PDType1Font("Times-BoldItalic");
public static final PDType1Font HELVETICA = new PDType1Font("Helvetica");
public static final PDType1Font HELVETICA_BOLD = new PDType1Font("Helvetica-Bold");
public static final PDType1Font HELVETICA_OBLIQUE = new PDType1Font("Helvetica-Oblique");
public static final PDType1Font HELVETICA_BOLD_OBLIQUE = new PDType1Font("Helvetica-BoldOblique");
public static final PDType1Font COURIER = new PDType1Font("Courier");
public static final PDType1Font COURIER_BOLD = new PDType1Font("Courier-Bold");
public static final PDType1Font COURIER_OBLIQUE = new PDType1Font("Courier-Oblique");
public static final PDType1Font COURIER_BOLD_OBLIQUE = new PDType1Font("Courier-BoldOblique");
public static final PDType1Font SYMBOL = new PDType1Font("Symbol");
public static final PDType1Font ZAPF_DINGBATS = new PDType1Font("ZapfDingbats");
  2. PDTrueTypeFont: This class represents a TrueType font, which is also an outline-based font format. TrueType fonts are also common in PDFs.

// "path/to/font.ttf" is a placeholder for an actual TrueType font file
PDTrueTypeFont font = PDTrueTypeFont.load(document, new File("path/to/font.ttf"), WinAnsiEncoding.INSTANCE);
  3. PDType0Font: This class represents a Type 0 font, a composite font format that can contain multiple subfonts. Type 0 fonts are usually used to support multiple languages and complex glyph requirements, and you can use this class to load your own custom font files.

PDType0Font font = PDType0Font.load(document, new ClassPathResource("/static/wryhRegular.ttf").getInputStream());

Write a single line of text

contentStream.setFont(PDType1Font.COURIER_BOLD_OBLIQUE, 16);
contentStream.beginText();
contentStream.newLineAtOffset(50, pageHeight-50);
contentStream.showText("test text");
contentStream.endText();

Before writing text, set the font and font size with contentStream.setFont(PDFont font, float fontSize), start a text block with beginText(), and set the text position with newLineAtOffset(x, y). Here (50, pageHeight - 50) places the text near the upper-left corner, 50 units from the left edge and 50 units below the top edge. Then output the text with showText(String text), and finally end the text block with endText().

Write multiple lines of text

contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);

// Set the text start coordinates
float startX = 50;
float startY = page.getMediaBox().getHeight() - 50;

// set line spacing
float leading = 15;

// write multiple lines of text
String[] lines = {
    "The first line of text",
    "The second line of text",
    "The third line of text"
};

contentStream.beginText();
contentStream.newLineAtOffset(startX, startY);

for (String line : lines) {
    contentStream.showText(line);
    contentStream.newLineAtOffset(0, -leading);
}

contentStream.endText();

Writing multiple lines works much like writing a single line: set the font and font size first, then the starting coordinates. The difference is that showText() and newLineAtOffset() are called repeatedly between beginText() and endText(). Each newLineAtOffset(0, -leading) call is relative to the start of the current line, so every iteration of the loop moves the text position down by one line.
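Note that showText() does not wrap text on its own, so a long string has to be split into lines before the loop runs. The following is a minimal greedy word-wrap sketch in plain Java. The class LineWrapper and the widthOf parameter are assumptions made for illustration: widthOf is any function that returns the rendered width of a string in page units.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical helper class, for illustration only
final class LineWrapper {

    /** Splits text into lines so that each line is at most maxWidth wide, breaking at spaces. */
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> widthOf) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (current.length() > 0 && widthOf.applyAsDouble(candidate) > maxWidth) {
                // The word does not fit: finish the current line and start a new one
                lines.add(current.toString());
                current = new StringBuilder(word);
            } else {
                current = new StringBuilder(candidate);
            }
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }
}
```

A single word wider than maxWidth is kept on its own line rather than split. In practice the measurer would be based on the font's glyph widths, for example PDFont.getStringWidth(text) / 1000 * fontSize, with its IOException handled inside the lambda.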

Insert image

PDImageXObject image = PDImageXObject.createFromFileByExtension(new File("path/to/image.jpg"), document);
float imageWidth = image.getWidth();
float imageHeight = image.getHeight();

PDPageContentStream contentStream = new PDPageContentStream(document, page);
// x and y are the coordinates of the lower-left corner of the image
contentStream.drawImage(image, x, y, imageWidth, imageHeight);

Here the PDImageXObject.createFromFileByExtension() method loads the image file and creates a PDImageXObject. Replace "path/to/image.jpg" with the path of the actual image file. The example draws the image at its original width and height, but you can pass a custom width and height instead. Finally, drawImage() writes the image into the PDF document: x and y are its coordinates, and imageWidth and imageHeight are the width and height at which the image is drawn.
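image.getWidth() and image.getHeight() return the size in pixels, while drawImage() interprets the values as page units, so a large picture drawn at its pixel size can easily exceed an A4 page (roughly 595 x 842 units). One common approach is to scale the image down while keeping its aspect ratio. The sketch below is plain arithmetic; the class ImageScaling and the method fitWithin are hypothetical names.

```java
// Hypothetical helper class, for illustration only
final class ImageScaling {

    /**
     * Returns {width, height} scaled down to fit within maxWidth x maxHeight
     * while keeping the aspect ratio. Images that already fit are not enlarged.
     */
    static float[] fitWithin(float imageWidth, float imageHeight, float maxWidth, float maxHeight) {
        float scale = Math.min(1f, Math.min(maxWidth / imageWidth, maxHeight / imageHeight));
        return new float[] { imageWidth * scale, imageHeight * scale };
    }
}
```

The two returned values can then be passed as the last two arguments of drawImage().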

Add a rectangle

// Set the border color
contentStream.setStrokingColor(new Color(213, 213, 213));
// Set the border width to 1
contentStream.setLineWidth(1);
// Add a 100 x 100 rectangle to the page content stream. (x, y) is the lower-left corner
// of the rectangle, so subtract its height to keep the top edge 50 units below the top of the page
contentStream.addRect(50, pageHeight - 50 - 100, 100, 100);
// Draw the border of the rectangle
contentStream.stroke();
// Restore the original stroking color, otherwise later lines and borders are affected
contentStream.setStrokingColor(Color.BLACK);

Common methods for calculating text coordinates

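/**
 * Get the font height for the given font size.
 */
float getFontHeight(PDType0Font customFont, float fontSize) {
    // Font metrics are expressed in 1/1000 of the font size
    return customFont.getFontDescriptor().getFontBoundingBox().getHeight() / 1000 * fontSize;
}

/**
 * Estimate the text width.
 * This assumes every character is as wide as the font size, which roughly holds
 * for full-width CJK characters but overestimates the width of Latin text.
 */
float getTextWidth(String text, float fontSize) {
    return fontSize * text.length();
}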
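For a width based on the actual glyphs, PDFont.getStringWidth() returns the width of a string in 1/1000 of a text unit, which can be scaled by the font size in the same way as the font height. A minimal sketch follows; the class TextMetrics and the helper centeredX are hypothetical names added for illustration.

```java
import java.io.IOException;

import org.apache.pdfbox.pdmodel.font.PDFont;

// Hypothetical helper class, for illustration only
final class TextMetrics {

    /** Width of the text in page units, based on the font's own glyph widths. */
    static float textWidth(PDFont font, String text, float fontSize) throws IOException {
        // getStringWidth reports the width in 1/1000 of a text unit
        return font.getStringWidth(text) / 1000f * fontSize;
    }

    /** x coordinate that centers the text horizontally on a page of the given width. */
    static float centeredX(PDFont font, String text, float fontSize, float pageWidth)
            throws IOException {
        return (pageWidth - textWidth(font, text, fontSize)) / 2f;
    }
}
```

The value returned by centeredX can be passed as the x argument of newLineAtOffset() to center a heading on the page.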

Attachments

PDFBox Official Documentation (2.0.24)