[PDFBox] Working with images in PDF documents: adding local images, adding network images, adaptive image width and height, horizontal and vertical centering

This article shows how to work with images in PDF documents using PDFBox: adding local images, adding network images, scaling images to fit the page, and centering images horizontally and vertically.

Contents

1. Working with images in PDFBox

1.1. Add local images

(1) Case code

(2) Running effect

(3) Method introduction

1.2. Add network images

(1) Case code

(2) Running effect

1.3. Adaptive image width and height (image scaling)

(1) Image scaling code

1.4. Read images

(1) Case code

(2) Running effect


1. Working with images in PDFBox

PDFBox can add image objects to PDF documents. An image is represented by a PDImageXObject, and every operation on page content goes through a PDPageContentStream (the page content stream). PDFBox treats the text, images, forms, and other content of each PDF page as a stream, and content is added, deleted, or modified through that stream. This section first shows how to add image objects to a PDF document.

1.1. Add local images

(1) Case code

Adding a local image means reading an image file from the local disk and drawing it into the page through a PDPageContentStream. The example code is as follows:

package pdfbox.demo.image;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 14:51
 * @Author ZhuYouBin
 * @Description: PDFBox operation picture
 */
public class PDFBoxImageUtil {

    /**
     * Save the picture of the given path to the pdf file
     * @param imgPath image path
     * @param destPdf generated pdf file path
     * @return returns the generated pdf file path
     */
    public static String generateImageToPdf(String imgPath, String destPdf) {
        // try-with-resources closes the document even if an exception is thrown
        try (PDDocument doc = new PDDocument()) {
            // 1. Create a page and add it to the document
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);
            // 2. Create the image object from a file on disk
            PDImageXObject image = PDImageXObject.createFromFile(imgPath, doc);
            // 3. Open the content stream of this page and draw the image
            try (PDPageContentStream stream = new PDPageContentStream(doc, page)) {
                // (10, 10) is the lower-left corner of the image; the PDF origin is the bottom-left of the page
                stream.drawImage(image, 10, 10);
            }
            doc.save(destPdf); // save the PDF document
        } catch (Exception e) {
            e.printStackTrace();
        }
        return destPdf;
    }

    public static void main(String[] args) {
        // Backslashes must be escaped inside Java string literals
        String imgPath = "E:\\demo\\001.jpg";
        String destPdf = "E:\\demo\\img.pdf";
        generateImageToPdf(imgPath, destPdf);
    }
}

(2) Running effect

(3) Method introduction

The PDImageXObject class provides the following commonly used methods:

  • createFromFile(imagePath, doc): a static method that reads an image from a file on the local disk.
    • imagePath parameter: the path of the image.
    • doc parameter: the PDF document object.
  • getImage(): an instance method that returns the image as a BufferedImage.
  • getSuffix(): an instance method that returns the suffix (file type) of the image, for example jpg or png.

1.2. Add network images

PDFBox does not provide a method that reads an image directly from a URL, but the same result can be achieved in two steps:

  • Step 1: Use a URL object to download the network image to the local disk.
  • Step 2: Use the createFromFile() method to read the downloaded image from the local disk.
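
The download step can be sketched with JDK classes alone. This is a minimal illustration, not PDFBox API: ImageDownloadSketch, extensionFor, saveToTempFile, and download are hypothetical names, and the Content-Type mapping is an assumption that covers only a few common image types.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: downloads an image to a temporary file.
final class ImageDownloadSketch {

    // Map a Content-Type header to a file extension (assumed mapping; extend as needed).
    static String extensionFor(String contentType) {
        if (contentType == null) {
            return "";
        }
        // Strip parameters such as "; charset=..." before comparing
        String type = contentType.split(";")[0].trim().toLowerCase(Locale.ROOT);
        switch (type) {
            case "image/jpeg": return ".jpg";
            case "image/png":  return ".png";
            case "image/gif":  return ".gif";
            default:           return "";
        }
    }

    // Copy any input stream into a new temporary file and return its path.
    static Path saveToTempFile(InputStream in, String extension) throws IOException {
        Path temp = Files.createTempFile("pdfbox-img-", extension);
        Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        return temp;
    }

    // Step 1: download the image behind the URL to a temporary file.
    static Path download(String imageUrl) throws IOException {
        URLConnection conn = new URL(imageUrl).openConnection();
        try (InputStream in = conn.getInputStream()) {
            return saveToTempFile(in, extensionFor(conn.getContentType()));
        }
    }
}
```

Step 2 would then pass download(url).toString() to createFromFile() and delete the temporary file once the document has been saved.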

(1) Case code

package pdfbox.demo.image;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.UUID;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 15:01
 * @Author ZhuYouBin
 * @Description: PDFBox manipulates pictures and adds network pictures to PDF documents
 */
public class PDFBoxImageUtil {

    /**
     * Save the picture of the given path (local path or http/https URL) to the pdf file
     * @param imgPath image path
     * @param destPdf generated pdf file path
     * @return returns the generated pdf file path
     */
    public static String generateImageToPdf(String imgPath, String destPdf) {
        String tempPath = null;
        // try-with-resources closes the document even if an exception is thrown
        try (PDDocument doc = new PDDocument()) {
            // 1. Create a page and add it to the document
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);
            // 2. Create the image object; a network image is downloaded to a temporary file first
            PDImageXObject image;
            if (imgPath.startsWith("http://") || imgPath.startsWith("https://")) {
                tempPath = downloadImage(imgPath, null);
                image = PDImageXObject.createFromFile(tempPath, doc);
            } else {
                image = PDImageXObject.createFromFile(imgPath, doc);
            }
            // 3. Open the content stream of this page and draw the image
            try (PDPageContentStream stream = new PDPageContentStream(doc, page)) {
                stream.drawImage(image, 10, 10); // draw the image into the PDF page
            }
            doc.save(destPdf); // save the PDF document
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Delete the local temporary file once the document has been written (or has failed)
            if (tempPath != null) {
                new File(tempPath).delete();
            }
        }
        return destPdf;
    }

    /**
     * Download network pictures to local
     * @param imgPath network image address
     * @param fileName file name
     * @return Returns the temporary path of the local image
     */
    public static String downloadImage(String imgPath, String fileName) {
        try {
            URLConnection conn = new URL(imgPath).openConnection();
            String contentType = conn.getContentType();
            System.out.println(contentType);
            // Create a temporary directory to hold the downloaded image
            File file = new File("temp");
            if (!file.exists() && !file.mkdirs()) {
                throw new RuntimeException("Failed to create temporary directory");
            }
            if (fileName == null || fileName.trim().isEmpty()) {
                fileName = UUID.randomUUID().toString();
            }
            // Choose the file extension from the Content-Type header (which may be null)
            switch (contentType == null ? "" : contentType) {
                case "image/jpeg": fileName += ".jpeg"; break;
                case "image/gif": fileName += ".gif"; break;
                case "image/webp": // caution: the data stays WebP, which PDFBox may not be able to read
                case "image/png": fileName += ".png"; break;
            }
            fileName = file.getAbsolutePath() + File.separator + fileName;
            // Copy the response body into the temporary file; both streams are closed automatically
            try (InputStream is = conn.getInputStream();
                 FileOutputStream fos = new FileOutputStream(fileName)) {
                byte[] data = new byte[1024];
                int len;
                while ((len = is.read(data)) != -1) {
                    fos.write(data, 0, len);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return fileName;
    }

    public static void main(String[] args) {
        String imgPath = "https://www.toopic.cn/public/uploads/small/1658043938262165804393852.jpg";
        String destPdf = "E:\\demo\\img.pdf"; // backslashes must be escaped in Java string literals
        generateImageToPdf(imgPath, destPdf);
    }
}

(2) Running effect

1.3. Adaptive image width and height (image scaling)

The previous sections add images to a PDF document, but when an image is larger than the page, the part that extends beyond the page is cut off. This can be solved by scaling the image, as follows:

  • Step 1: Obtain the actual width and height of the image. The JDK reports them in pixels [px], which need to be converted to points [pt]. Conversion rule: 1 px = 3/4 pt (assuming a 96 DPI image, since 1 pt = 1/72 inch).
  • Step 2: Obtain the width and height of the PDF page (PDFBox reports them in [pt]).
  • Step 3: Compare the actual width and height of the image with the width and height of the page to calculate the scaling ratio.
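
The three steps can be tried out as a small calculation that does not depend on PDFBox. This is a sketch under stated assumptions: FitSketch and fitAndCenter are hypothetical names, the image is assumed to be 96 DPI (1 px = 0.75 pt), images are only shrunk and never enlarged, and the 600 x 800 pt page in main is a round-number stand-in rather than a real paper size.

```java
// Hypothetical helper: computes where and how large to draw an image so that it fits a page.
final class FitSketch {

    static final float PT_PER_PX = 72f / 96f; // 0.75, assuming a 96 DPI image

    // Returns {x, y, width, height} in pt for an image of wPx x hPx pixels,
    // centered on a page of pageW x pageH pt with the given margin on every side.
    static float[] fitAndCenter(int wPx, int hPx, float pageW, float pageH, float margin) {
        // Step 1: image size in pt
        float w = wPx * PT_PER_PX;
        float h = hPx * PT_PER_PX;
        // Step 2: usable page area in pt
        float availW = pageW - 2 * margin;
        float availH = pageH - 2 * margin;
        // Step 3: one scale factor for both axes keeps the aspect ratio;
        // capping it at 1 means small images are not enlarged
        float scale = Math.min(1f, Math.min(availW / w, availH / h));
        w *= scale;
        h *= scale;
        // Center the scaled image on the page
        return new float[]{(pageW - w) / 2, (pageH - h) / 2, w, h};
    }

    public static void main(String[] args) {
        // A 1600 x 1200 px image on a 600 x 800 pt page with a 30 pt margin
        float[] r = fitAndCenter(1600, 1200, 600f, 800f, 30f);
        System.out.println("x=" + r[0] + " y=" + r[1] + " w=" + r[2] + " h=" + r[3]);
    }
}
```

For the values in main, the image measures 1200 x 900 pt before scaling, the width is the limiting side (540 / 1200 = 0.45), and the result is a 540 x 405 pt image whose lower-left corner sits at (30, 197.5).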

(1) Image scaling code

package pdfbox.demo.image;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.UUID;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 15:11
 * @Author ZhuYouBin
 * @Description: PDFBox operates pictures, and the picture width and height are automatically scaled
 */
public class PDFBoxImageUtil {

    /**
     * Save the picture of the given path (local path or http/https URL) to the pdf file,
     * scaled to fit the page and centered
     *
     * @param imgPath image path
     * @param destPdf generated pdf file path
     * @return true if the pdf file was generated successfully, false otherwise
     */
    public static boolean generateImageToPdf(String imgPath, String destPdf) {
        // try-with-resources closes the document even if an exception is thrown
        try (PDDocument doc = new PDDocument()) {
            // 1. Create a page and add it to the document
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);
            // 2. Create the image object; a network image is downloaded to a temporary file first
            if (imgPath.startsWith("http://") || imgPath.startsWith("https://")) {
                imgPath = downloadImage(imgPath, null);
            }
            PDImageXObject image = PDImageXObject.createFromFile(imgPath, doc);
            // 3. Compute the position and the scaled size of the image: {x, y, width, height}
            float[] imageWH = getImageWH(imgPath, page.getMediaBox());
            // 4. Open the content stream of this page and draw the image
            try (PDPageContentStream stream = new PDPageContentStream(doc, page)) {
                stream.drawImage(image, imageWH[0], imageWH[1], imageWH[2], imageWH[3]);
            }
            doc.save(destPdf); // save the PDF document
            return true;
        } catch (Exception e) {
            e.printStackTrace();
        }
        return false;
    }

    /**
     * Get the position and the scaled width and height of the picture, the unit is [pt]
     *
     * @param imgPath image path (local path or http/https URL)
     * @param box rectangular area of the PDF page, which provides the page width and height
     * @return an array {x, y, width, height} describing the scaled, centered image
     */
    public static float[] getImageWH(String imgPath, PDRectangle box) {
        boolean remote = imgPath.startsWith("http://") || imgPath.startsWith("https://");
        // Open a network stream or a local file stream depending on the path; it is closed automatically
        try (InputStream is = remote ? new URL(imgPath).openStream() : new FileInputStream(imgPath)) {
            BufferedImage bi = ImageIO.read(is);
            // Convert px to pt: 1 px = 3/4 pt, assuming a 96 DPI image
            float width = bi.getWidth() * 3f / 4;
            float height = bi.getHeight() * 3f / 4;
            // Usable area: subtracting 60 leaves a 30 pt margin on each side (optional)
            float pw = box.getWidth() - 60;
            float ph = box.getHeight() - 60;
            // One ratio for both sides keeps the aspect ratio; the image is only shrunk,
            // and only as far as needed for both width and height to fit
            float scale = Math.min(1f, Math.min(pw / width, ph / height));
            width *= scale;
            height *= scale;
            // Calculate the display position of the image on the X and Y axes
            float xAxis = (box.getWidth() - width) / 2; // center horizontally
            // float yAxis = box.getHeight() - height - 10; // alternative: 10 pt from the top of the page
            float yAxis = (box.getHeight() - height) / 2; // center vertically
            return new float[]{xAxis, yAxis, width, height};
        } catch (Exception e) {
            e.printStackTrace();
        }
        return new float[]{0, 0, 0, 0};
    }

    /**
     * Download network pictures to local
     * @param imgPath network image address
     * @param fileName file name
     * @return Returns the temporary path of the local image
     */
    public static String downloadImage(String imgPath, String fileName) {
        try {
            URLConnection conn = new URL(imgPath).openConnection();
            String contentType = conn.getContentType();
            // Create a temporary directory to hold the downloaded image
            File file = new File("temp");
            if (!file.exists() && !file.mkdirs()) {
                throw new RuntimeException("Failed to create temporary directory");
            }
            if (fileName == null || fileName.trim().isEmpty()) {
                fileName = UUID.randomUUID().toString().replaceAll("-", "");
            }
            // Choose the file extension from the Content-Type header (which may be null)
            switch (contentType == null ? "" : contentType) {
                case "image/jpeg": fileName += ".jpeg"; break;
                case "image/gif": fileName += ".gif"; break;
                case "image/webp": // caution: the data stays WebP, which PDFBox may not be able to read
                case "image/png": fileName += ".png"; break;
            }
            fileName = file.getAbsolutePath() + File.separator + fileName;
            // Copy the response body into the temporary file; both streams are closed automatically
            try (InputStream is = conn.getInputStream();
                 FileOutputStream fos = new FileOutputStream(fileName)) {
                byte[] data = new byte[1024];
                int len;
                while ((len = is.read(data)) != -1) {
                    fos.write(data, 0, len);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return fileName;
    }

    public static void main(String[] args) {
        String imgPath = "https://www.toopic.cn/public/uploads/small/1658043938262165804393852.jpg";
        String destPdf = "E:\\demo\\img.pdf"; // backslashes must be escaped in Java string literals
        generateImageToPdf(imgPath, destPdf);
    }
}

1.4. Read images

PDFBox can also read images from a PDF document so that they can be saved to the local disk. To save an image, use the ImageIO class provided by the JDK: its write() method writes an image object to a File.
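
The ImageIO.write() call mentioned above can be tried on any BufferedImage, independently of PDFBox. The sketch below is an illustration only: ImageSaveSketch and saveAsPng are hypothetical names, and PNG is picked here as an assumption because it also handles images with transparency. ImageIO.write() returns false when no suitable writer is found, so the return value is checked.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

// Hypothetical helper: saves a BufferedImage and reports a missing writer as an error.
final class ImageSaveSketch {

    // Writes the image as PNG; ImageIO.write returns false if no writer accepts the image.
    static void saveAsPng(BufferedImage image, File target) throws IOException {
        boolean written = ImageIO.write(image, "png", target);
        if (!written) {
            throw new IOException("No suitable writer found for format png");
        }
    }

    public static void main(String[] args) throws IOException {
        // A blank 4 x 3 image stands in for one extracted from a PDF document
        BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_ARGB);
        File target = File.createTempFile("extracted-", ".png");
        saveAsPng(image, target);
        // Read the file back to confirm that it holds a decodable image
        BufferedImage reread = ImageIO.read(target);
        System.out.println(reread.getWidth() + " x " + reread.getHeight());
        target.delete();
    }
}
```

Checking the return value matters because a false result leaves no image on disk without raising any exception.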

(1) Case code

package pdfbox.demo.image;

import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 15:11
 * @Author ZhuYouBin
 * @Description: PDFBox manipulates images, [reads images] from PDF documents, and saves them locally
 */
public class PDFBoxImageUtil {

    /**
     * From the given pdf document, get all the pictures in the specified page and save them to the local disk
     * <p>
     * The pictures in a pdf document are stored as image XObjects in the resources of each page.
     * PDFBox decodes such an object into a BufferedImage through getImage(), which can then be written to a file.
     * </p>
     * @param pdfPath PDF document path
     * @param imagePath generated image path and name
     * @param pageNum index of the page to read, starting from 0
     * @return Returns the local path of the extracted image
     */
    public static String readerImageFromPdf(String pdfPath, String imagePath, int pageNum) {
        // 1. Load the PDF document (PDFBox 2.x API); try-with-resources closes it automatically
        try (PDDocument doc = PDDocument.load(new File(pdfPath))) {
            // 2. Do nothing if the requested page does not exist
            if (pageNum < 0 || pageNum >= doc.getNumberOfPages()) {
                return imagePath;
            }
            // Get the requested page and its resource object
            PDPage page = doc.getPage(pageNum);
            PDResources resources = page.getResources();
            // 3. Traverse the XObjects of the page to find the image objects
            for (COSName cosName : resources.getXObjectNames()) {
                PDXObject pdxObject = resources.getXObject(cosName);
                // Determine if it is an image object
                if (pdxObject instanceof PDImageXObject) {
                    // Decode the image object
                    BufferedImage image = ((PDImageXObject) pdxObject).getImage();
                    // Save to local disk; if the page holds several images, each one overwrites
                    // the previous file. ImageIO.write returns false when no writer accepts the image.
                    ImageIO.write(image, "JPEG", new File(imagePath));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return imagePath;
    }

    public static void main(String[] args) {
        // Backslashes must be escaped inside Java string literals
        String imgPath = "E:\\img\\002.jpg";
        String destPdf = "E:\\demo\\img.pdf";
        readerImageFromPdf(destPdf, imgPath, 0);
    }
}

(2) Running effect

This concludes the introduction to working with images in PDFBox: adding local images, adding network images, scaling images to fit the page, centering images horizontally and vertically, and reading images from a PDF document.