Pitfalls I hit while using pdfbox: content that will not render and content that renders upside down. A summary to help you avoid detours

Hello everyone, I am "Singularity"; that is also the name I go by online. I have only been working for a few years and would like to make progress together with all of you.

A self-motivated blogger working on Java ToB products at a large company!

I like Java and Python. I am usually lazy: if a program can solve a problem, I will never solve it by hand.

If you are interested in Java, please follow me.

Thank you all!

I have recently been in charge of the back end of a project and have had no time to update the blog, for which I apologize. I will gradually learn to manage my schedule and balance work with sharing, and I will try to publish more articles. You are welcome to learn and progress together with me.

The project recently started using pdfbox. Most people use itext, so I had little experience with pdfbox, but because of itext's open-source license the project chose pdfbox, and there are relatively few articles about pdfbox in China. The pdf files in the project come in all shapes and sizes, so I took quite a few detours while rendering them and ran into all kinds of messy problems. Here is a summary for your reference.

First, the requirements and the problems encountered

Requirements

Render the pdf normally, and write the variable values into the pdf at the positions passed by the front end

Wrap lines automatically when rendering the pdf (because of time pressure, this is only an idea so far)

Problems

In some pdf files the rendered content is inverted

In some pdf files the rendered content is scaled

In some pdf files the content is not rendered at all

So many inexplicable problems are a real headache, but there are many users, so all I can do is solve the problems one by one according to what the users need. When will the company start thinking about its employees? That is just a brief complaint; in the current environment this is the only place I can complain, and I hope it does not affect your mood.

Okay, let's get down to business: using pdfbox.

The rendering part is relatively simple and there are plenty of articles about it online, so I will describe it only briefly and give you the core code.

@Data
@ToString
public class ReplaceRegion {
    /**
     * Uniquely identifies
     */
    private String id;
    /**
     * replace content
     */
    private String replaceText;
    /**
     * x-coordinate
     */
    private Float x;
    /**
     * y-coordinate
     */
    private Float y;
    /**
     * width
     */
    private Float w;
    /**
     * height
     */
    private Float h;
    /**
     * Font properties
     */
    private FontValue fontValue;
}

The class above is the data object used throughout the rendering: an id, the replacement text, its position and size, and the font information.
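One thing to settle before filling in x and y is which corner the front end measures from. The article does not say, so the following is only a sketch under the assumption that the front end reports positions from the top-left corner of the page, while pdf's default coordinate system has its origin at the bottom-left with y growing upward. TopLeftToPdf and its parameter names are hypothetical and not part of the project.

```java
/** Hypothetical helper: converts a top-left based box to pdf's bottom-left based coordinates. */
final class TopLeftToPdf {

    /**
     * @param topY       distance from the top edge of the page to the top edge of the box
     * @param boxHeight  height of the box
     * @param pageHeight height of the page, all in the same unit (pdf points)
     * @return y of the bottom edge of the box, measured from the bottom of the page
     */
    static float toPdfY(float topY, float boxHeight, float pageHeight) {
        return pageHeight - topY - boxHeight;
    }

    public static void main(String[] args) {
        // Assumed page height of 842 points (roughly A4 portrait);
        // a box 20 points high whose top edge is 100 points below the top of the page
        System.out.println(toPdfY(100f, 20f, 842f)); // prints 722.0
    }
}
```

In a real project the page height would probably be read from the page itself, for example from its media box, rather than hard-coded.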

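import java.awt.Color;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// FontValue, FontEnum and GlobalException are classes of this project; StringUtils and
// CollectionUtils come from whichever utility library the project already uses.
public class PdfboxReplace {
    private static final Integer CAPACITY = 1 << 4;
    private static final Logger log = LoggerFactory.getLogger(PdfboxReplace.class);
    /**
     * output stream
     */
    private ByteArrayOutputStream output;
    /**
     * pdf document
     */
    private PDDocument document;
    /**
     * content stream of the page currently being written
     */
    private PDPageContentStream contentStream;
    /**
     * Page numbers passed in start at 1, pdfbox page indexes start at 0
     */
    private static final Integer DECREASE_ONE = 1;
    /**
     * Default font size
     */
    private int FONT_SIZE = 12;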



    public PdfboxReplace(PDDocument document) {
        this.document=document;
        output = new ByteArrayOutputStream();
    }

    private PdfboxReplace(byte[] pdfBytes) throws IOException {
        init(pdfBytes);
    }

    private void init(byte[] pdfBytes) throws IOException {
        log.info("===========[pdf area replacement initialization starts]=================");
        document = PDDocument.load(pdfBytes,null,null,null, MemoryUsageSetting.setupTempFileOnly());
        output = new ByteArrayOutputStream();
        log.info("===========[pdf area replacement initialization completed]================");
    }

    /**
     * Replace text according to the custom regions
     *
     * @param replaceRegionMap key: page number (starting at 1), value: regions to render on that page
     * @throws IOException if the content cannot be written or the document cannot be saved
     */
    private void process(Map<Integer, List<ReplaceRegion>> replaceRegionMap) throws IOException {
        try {
            //Cache the fonts of the current file
            Map<String, PDType0Font> fontCache = new ConcurrentHashMap<>();
            for (Entry<Integer, List<ReplaceRegion>> entry : replaceRegionMap.entrySet()) {
                //Page numbers passed in start at 1, pdfbox page indexes start at 0
                PDPage page = document.getPage(entry.getKey() - DECREASE_ONE);
                contentStream = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, false, true);
                for (ReplaceRegion region : entry.getValue()) {
                    Float cursorX = region.getX();
                    Float cursorY = region.getY();
                    //Draw a rectangle as a background cover, not used for now
                    //contentStream.setNonStrokingColor(Color.WHITE);
                    //contentStream.addRect(cursorX, cursorY, 100, cursorY + 100);
                    //contentStream.fill();
                    /**Add text*/
                    contentStream.setNonStrokingColor(Color.BLACK);
                    contentStream.beginText();
                    //Set text properties
                    String fontKey = region.getFontValue().getSize() + region.getFontValue().getFontStyle();
                    //A cache hit uses the cached font, otherwise the font is loaded and cached
                    PDType0Font font = fontCache.get(fontKey);
                    if (font == null) {
                        InputStream fontInfo = getFontInfo(region.getFontValue());
                        font = PDType0Font.load(document, fontInfo);
                        fontCache.put(fontKey, font);
                    }
                    //Set font size and font
                    contentStream.setFont(font, region.getFontValue().getSize() != null ? region.getFontValue().getSize() : FONT_SIZE);
                    contentStream.newLineAtOffset(cursorX, cursorY + 3);
                    //showText encodes the text with the current font, no separate encode call is needed
                    contentStream.showText(region.getReplaceText());
                    contentStream.endText();
                }
                contentStream.close();
            }
            document.save(output);
        } catch (Exception e) {
            log.error("pdf process error: {}", e.getMessage(), e);
            //process only declares IOException, so wrap the cause instead of throwing a plain Exception
            throw new IOException("replace pdf content exception", e);
        } finally {
            if (contentStream != null) {
                contentStream.close();
            }
            if (document != null) {
                document.close();
            }
        }
    }

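    /**
     * Build a region from the given parameters
     *
     * @param x x-coordinate
     * @param y y-coordinate
     * @param w width
     * @param h height
     * @param text replacement text
     * @param fontValue font properties, may be null
     */
    public ReplaceRegion replaceText(float x, float y, float w, float h, String text, FontValue fontValue) {
        //ReplaceRegion as shown above has no constructor that takes the text, so use the setter
        ReplaceRegion region = new ReplaceRegion();
        region.setReplaceText(text);
        region.setH(h);
        region.setW(w);
        region.setX(x);
        region.setY(y);
        region.setFontValue(this.getFontVale(fontValue));
        return region;
    }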

    /**
     * Get font properties
     *
     * @param fontValue
     * @return
     */
    public FontValue getFontVale(FontValue fontValue) {
        if (fontValue != null) {
            fontValue.setSize(fontValue.getSize() == null ? FONT_SIZE : fontValue.getSize());
            fontValue.setFontStyle(StringUtils.isBlank(fontValue.getFontStyle()) ? FontEnum.SIM_SUN.getCode() : fontValue.getFontStyle());
        } else {
            fontValue = new FontValue();
            fontValue.setSize(FONT_SIZE);
            fontValue.setFontStyle(FontEnum.SIM_SUN.getCode());
        }
        return fontValue;
    }

    /**
     * Replace pdf text area
     *
     * @param regions region parameters key: page number, value: region parameters
     */
    public byte[] PdfReplaceRegion(Map<Integer, List<ReplaceRegion>> regions) {
        //The text area data information to be replaced
        Map<Integer, List<ReplaceRegion>> replaceRegionMap = new ConcurrentHashMap<>(CAPACITY);
        for (Map.Entry<Integer, List<ReplaceRegion>> mapEntry : regions.entrySet()) {
            List<ReplaceRegion> replaceRegionList = new ArrayList<>();
            //Check the regions of this page, not the whole map
            if (!CollectionUtils.isEmpty(mapEntry.getValue())) {
                for (ReplaceRegion region : mapEntry.getValue()) {
                    replaceRegionList.add(this.replaceText(region.getX(), region.getY(), region.getW(), region.getH(), region.getReplaceText(), region.getFontValue()));
                }
            }
            replaceRegionMap.put(mapEntry.getKey(), replaceRegionList);
        }
        try {
            //Get the generated pdf stream
            return this.toPdf(replaceRegionMap);
        } catch (IOException e) {
            log.error(e.getMessage(), e);
            //The method declares no checked exception, so throw the project's unchecked exception
            throw new GlobalException("copy conversion pdf exception");
        }
    }



    /**
     * Replace pdf text area
     *
     * @param regions region parameters key: page number, value: region parameters
     * @param pdfBytes source file bytecode
     */
    public byte[] PdfReplaceRegion(Map<Integer, List<ReplaceRegion>> regions, byte[] pdfBytes) {
        //The text area data information to be replaced
        Map<Integer, List<ReplaceRegion>> replaceRegionMap = new ConcurrentHashMap<>(CAPACITY);
        PdfboxReplace pdPlacer;
        try {
            pdPlacer = new PdfboxReplace(pdfBytes);
        } catch (IOException e) {
            log.error(e.getMessage(), e);
            throw new GlobalException("replace pdf area file error");
        }
        for (Map.Entry<Integer, List<ReplaceRegion>> mapEntry : regions.entrySet()) {
            List<ReplaceRegion> replaceRegionList = new ArrayList<>();
            //Check the regions of this page, not the whole map
            if (!CollectionUtils.isEmpty(mapEntry.getValue())) {
                for (ReplaceRegion region : mapEntry.getValue()) {
                    replaceRegionList.add(pdPlacer.replaceText(region.getX(), region.getY(), region.getW(), region.getH(), region.getReplaceText(), region.getFontValue()));
                }
            }
            replaceRegionMap.put(mapEntry.getKey(), replaceRegionList);
        }
        try {
            //Get the generated pdf stream
            return pdPlacer.toPdf(replaceRegionMap);
        } catch (IOException e) {
            log.error(e.getMessage(), e);
            //The method declares no checked exception, so throw the project's unchecked exception
            throw new GlobalException("copy conversion pdf exception");
        }
    }

    /**
     * Get font stream
     *
     * @param fontValue
     * @return
     */
    private InputStream getFontInfo(FontValue fontValue) {
        InputStream resourceAsStream = null;
        //custom font
        if (fontValue != null && StringUtils.isNotBlank(fontValue.getFontStyle())) {
            resourceAsStream = Thread.currentThread().getContextClassLoader().getResourceAsStream(FontEnum.getValue(fontValue.getFontStyle()));
            return resourceAsStream;
        }
        //Default font: SimSun
        resourceAsStream = Thread.currentThread().getContextClassLoader().getResourceAsStream(FontEnum.getValue(FontEnum.SIM_SUN.getValue()));
        return resourceAsStream;
    }

    /**
     * Generate a new PDF file
     *
     * @param replaceRegionMap needs to replace the data information
     * @return
     * @throws IOException
     */
    public byte[] toPdf(Map<Integer, List<ReplaceRegion>> replaceRegionMap) throws IOException {
        try {
            //alternative method
            this.process(replaceRegionMap);
            log.info("===========[pdf file generated successfully]===============");
            return output.toByteArray();
        } catch (IOException e) {
            log.error(e.getMessage(), e);
            throw e;
        } finally {
            //Close the resource
            if (output != null) {
                output.close();
            }
        }
    }
}

That is the whole utility class used for rendering. Its core is the process method:

    /**
     * Replace text according to the custom regions
     *
     * @param replaceRegionMap key: page number (starting at 1), value: regions to render on that page
     * @throws IOException if the content cannot be written or the document cannot be saved
     */
    private void process(Map<Integer, List<ReplaceRegion>> replaceRegionMap) throws IOException {
        try {
            //Cache the fonts of the current file
            Map<String, PDType0Font> fontCache = new ConcurrentHashMap<>();
            for (Entry<Integer, List<ReplaceRegion>> entry : replaceRegionMap.entrySet()) {
                //Page numbers passed in start at 1, pdfbox page indexes start at 0
                PDPage page = document.getPage(entry.getKey() - DECREASE_ONE);
                contentStream = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, false, true);
                for (ReplaceRegion region : entry.getValue()) {
                    Float cursorX = region.getX();
                    Float cursorY = region.getY();
                    //Draw a rectangle as a background cover, not used for now
                    //contentStream.setNonStrokingColor(Color.WHITE);
                    //contentStream.addRect(cursorX, cursorY, 100, cursorY + 100);
                    //contentStream.fill();
                    /**Add text*/
                    contentStream.setNonStrokingColor(Color.BLACK);
                    contentStream.beginText();
                    //Set text properties
                    String fontKey = region.getFontValue().getSize() + region.getFontValue().getFontStyle();
                    //A cache hit uses the cached font, otherwise the font is loaded and cached
                    PDType0Font font = fontCache.get(fontKey);
                    if (font == null) {
                        InputStream fontInfo = getFontInfo(region.getFontValue());
                        font = PDType0Font.load(document, fontInfo);
                        fontCache.put(fontKey, font);
                    }
                    //Set font size and font
                    contentStream.setFont(font, region.getFontValue().getSize() != null ? region.getFontValue().getSize() : FONT_SIZE);
                    contentStream.newLineAtOffset(cursorX, cursorY + 3);//Adjust according to your actual situation
                    //showText encodes the text with the current font, no separate encode call is needed
                    contentStream.showText(region.getReplaceText());
                    contentStream.endText();
                }
                contentStream.close();
            }
            document.save(output);
        } catch (Exception e) {
            log.error("pdf process error: {}", e.getMessage(), e);
            //process only declares IOException, so wrap the cause instead of swallowing it
            throw new IOException("replace pdf content exception", e);
        } finally {
            if (contentStream != null) {
                contentStream.close();
            }
            if (document != null) {
                document.close();
            }
        }
    }

In this method, the constructor of the pdf content stream is where I ran into trouble. Many of the problems came from not being familiar with it, so I will explain it here; it is also the key to the fixes that follow. I use the constructor with 5 parameters:

contentStream = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, false, true);

public PDPageContentStream(PDDocument document, PDPage sourcePage, AppendMode appendContent, boolean compress, boolean resetContext)

The two most important parameters are AppendMode appendContent and boolean resetContext.

AppendMode is an enumeration class

 /**
         * Overwrite the existing page content streams.
         */
        OVERWRITE,
        /**
         * Append the content stream after all existing page content streams.
         */
        APPEND,
        /**
         * Insert before all other page content streams.
         */
        PREPEND;

AppendMode has 3 possible values. After working through these pitfalls, my understanding is that pdfbox draws a page by rendering its content streams one after another, layer by layer.

OVERWRITE: the new content stream replaces the existing content of the page, so the page shows only the content you added (use with caution).

APPEND: the new content stream is added after all existing content streams of the page, so the content you add is drawn last, on the top layer. Our new values are drawn into the blank areas of the existing pdf, so this is the option we normally use.

PREPEND: the new content stream is inserted before all existing content streams of the page. This is the opposite of APPEND: the new content is drawn first, so it may be covered by whatever the page draws afterwards, such as horizontal lines or filled areas, and then it looks as if the content was not rendered. The root cause is that it is covered, not that it cannot be rendered.
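The layer-by-layer picture can be made concrete with a toy model. The sketch below does not call pdfbox at all; AppendModeModel and paintOrder are hypothetical names that only mimic how the three modes order the content streams, where a later entry is painted on top of an earlier one.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of paint order; it does not use pdfbox, it only mimics the three modes. */
final class AppendModeModel {

    enum Mode { OVERWRITE, APPEND, PREPEND }

    /** Returns the streams in the order they are painted; later entries are drawn on top. */
    static List<String> paintOrder(List<String> existing, String added, Mode mode) {
        List<String> order = new ArrayList<>();
        switch (mode) {
            case OVERWRITE:
                order.add(added);        // the existing streams are discarded
                break;
            case APPEND:
                order.addAll(existing);
                order.add(added);        // painted last, so it ends up on top
                break;
            case PREPEND:
                order.add(added);        // painted first, so later streams may cover it
                order.addAll(existing);
                break;
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> existing = List.of("template text", "filled table background");
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + paintOrder(existing, "new value", mode));
        }
    }
}
```

With PREPEND the new value sits underneath the filled table background in this model, which is the "not rendered" symptom described above.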

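The resetContext parameter matters too, and it is the key to the scaling and inversion problems. If you use the constructor with 4 parameters, resetContext defaults to false, which means the graphics context is not reset before your content is appended. In essence, the inversion problem comes from the coordinate system left behind by the existing content of the pdf: in some files the origin is in the lower left corner (the pdf default), while the content of other files moves the origin to the upper left corner and flips the y axis, or applies a scale, without restoring it afterwards. Content appended after that is drawn in the changed coordinate system, so it comes out inverted or scaled. Passing true resets the graphics context, so the appended content is drawn in the default coordinate system again.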
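To see why a leftover coordinate system produces both symptoms, here is a small sketch that uses java.awt.geom.AffineTransform as a stand-in for the pdf transformation matrix. It is a toy model, not pdfbox code: LeftoverMatrixDemo, the page height of 842 and the 75 % scale are all assumptions chosen for illustration.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

/** Toy model: where does a point land if the previous content left a changed matrix behind? */
final class LeftoverMatrixDemo {

    /** Maps the coordinates given for the new text through the matrix currently in effect. */
    static Point2D place(AffineTransform currentMatrix, double x, double y) {
        return currentMatrix.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        double pageHeight = 842; // assumed page height in points

        // Default pdf user space: origin in the lower left corner, nothing changed
        AffineTransform untouched = new AffineTransform();

        // Existing content moved the origin to the upper left corner and flipped the y axis,
        // then never restored the graphics state
        AffineTransform flipped = new AffineTransform(1, 0, 0, -1, 0, pageHeight);

        // Same as above, but additionally scaled to 75 %
        AffineTransform flippedAndScaled = new AffineTransform(flipped);
        flippedAndScaled.scale(0.75, 0.75);

        // The same requested position (100, 700) ends up in three different places
        System.out.println(place(untouched, 100, 700));
        System.out.println(place(flipped, 100, 700));
        System.out.println(place(flippedAndScaled, 100, 700));
    }
}
```

The flip affects the glyphs as well as the position, which is why the text appears upside down rather than merely misplaced.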

Some articles suggest using PREPEND instead. That does give the new content the default coordinate origin, because it is drawn before the existing content changes anything, but it brings back the problem described above where the content appears not to be rendered. The original discussion is here:

https://stackoverflow.com/questions/27919436/pdfbox-pdpagecontentstreams-append-mode-misbehaving

There are two ways to solve the inversion problem. I solved it through the constructor parameter. Alternatively, you can save and restore the graphics state in the first content stream, so that its changes do not leak into later streams, by calling:

saveGraphicsState();
//...
restoreGraphicsState();
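Why saving and restoring helps can be shown with a toy model of the graphics state stack. The class below is a hypothetical sketch that keeps only the transformation matrix and again uses java.awt.geom.AffineTransform in place of the real pdf state; it is not how pdfbox is implemented.

```java
import java.awt.geom.AffineTransform;
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the graphics state stack (save and restore); not pdfbox code. */
final class GraphicsStateStack {

    private final Deque<AffineTransform> saved = new ArrayDeque<>();
    private AffineTransform current = new AffineTransform();

    /** Save: remember a copy of the current matrix. */
    void save() {
        saved.push(new AffineTransform(current));
    }

    /** Restore: go back to the most recently saved matrix. */
    void restore() {
        current = saved.pop();
    }

    /** Combine another transformation with the current matrix. */
    void concatenate(AffineTransform change) {
        current.concatenate(change);
    }

    /** True if content drawn now would use the default coordinate system. */
    boolean isDefault() {
        return current.isIdentity();
    }

    public static void main(String[] args) {
        AffineTransform flip = new AffineTransform(1, 0, 0, -1, 0, 842);

        // Existing content without protection: the flip leaks into whatever is appended
        GraphicsStateStack unprotected = new GraphicsStateStack();
        unprotected.concatenate(flip);
        System.out.println("unprotected: " + unprotected.isDefault()); // prints unprotected: false

        // Existing content wrapped in save and restore: appended text starts from the default matrix
        GraphicsStateStack wrapped = new GraphicsStateStack();
        wrapped.save();
        wrapped.concatenate(flip);
        wrapped.restore();
        System.out.println("wrapped: " + wrapped.isDefault()); // prints wrapped: true
    }
}
```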

The line wrapping problem

The idea is to provide a resizable component on the front end whose width is passed to the back end. The back end then calculates, from the font and the component width, how many characters fit on each line, and wraps the text accordingly. One caveat is punctuation and symbols: they take up less width than Chinese characters, so a fixed number of characters per line is only an approximation.
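A minimal sketch of that idea, wrapping by accumulated width instead of by character count. The width rule here is an assumption made for illustration: ASCII characters are treated as half the font size wide and everything else as one full font size wide. WidthWrapSketch and its methods are hypothetical; with pdfbox the real width of a string could instead be measured with the font's getStringWidth method, which reports widths in 1/1000 of the font size.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: wrap text by estimated width rather than by a fixed character count. */
final class WidthWrapSketch {

    /** Assumed width rule: ASCII takes half the font size, any other character the full font size. */
    static float charWidth(int codePoint, float fontSize) {
        return codePoint < 0x80 ? fontSize / 2f : fontSize;
    }

    /** Splits the text into lines whose estimated width does not exceed boxWidth. */
    static List<String> wrap(String text, float boxWidth, float fontSize) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        float used = 0f;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            float width = charWidth(codePoint, fontSize);
            // Start a new line when the next character would not fit any more
            if (used + width > boxWidth && line.length() > 0) {
                lines.add(line.toString());
                line.setLength(0);
                used = 0f;
            }
            line.appendCodePoint(codePoint);
            used += width;
            i += Character.charCount(codePoint);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Font size 12, box 24 wide: four ASCII characters or two Chinese characters per line
        System.out.println(wrap("abcdefgh", 24f, 12f));
        // \u4e2d\u6587 are two Chinese characters
        System.out.println(wrap("\u4e2d\u6587ab", 24f, 12f));
    }
}
```

The code that follows takes the simpler route for now: it splits the text into chunks of a fixed 20 characters and leaves the width calculation as a TODO.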

    private void process(Map<Integer, List<ReplaceRegion>> replaceRegionMap) throws IOException {
        try {
            //Cache the fonts of the current file
            Map<String, PDType0Font> fontCache = new ConcurrentHashMap<>();
            for (Entry<Integer, List<ReplaceRegion>> entry : replaceRegionMap.entrySet()) {
                //Page numbers passed in start at 1, pdfbox page indexes start at 0
                PDPage page = document.getPage(entry.getKey() - DECREASE_ONE);
                //resetContext = true, as explained above, so the text is not inverted or scaled
                contentStream = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, false, true);
                for (ReplaceRegion region : entry.getValue()) {
                    Float cursorX = region.getX();
                    Float cursorY = region.getY();
                    //Draw a rectangle as a background cover, not used for now
                    //contentStream.setNonStrokingColor(Color.WHITE);
                    //contentStream.addRect(cursorX, cursorY, 100, cursorY + 100);
                    //contentStream.fill();
                    //Look up the font once per region; a cache hit skips loading
                    String fontKey = region.getFontValue().getSize() + region.getFontValue().getFontStyle();
                    PDType0Font font = fontCache.get(fontKey);
                    if (font == null) {
                        InputStream fontInfo = getFontInfo(region.getFontValue());
                        font = PDType0Font.load(document, fontInfo);
                        fontCache.put(fontKey, font);
                    }
                    //TODO calculate the number of characters per line from the region width
                    //TODO calculate the distance to move down from the font size as well
                    List<String> strList = MyStringSpitUtil.getStrList(region.getReplaceText(), 20);
                    //Render the chunks one line at a time
                    for (int i = 0; i < strList.size(); i++) {
                        /**Add text*/
                        contentStream.setNonStrokingColor(Color.BLACK);
                        contentStream.beginText();
                        //Set font size and font
                        contentStream.setFont(font, region.getFontValue().getSize() != null ? region.getFontValue().getSize() : FONT_SIZE);
                        contentStream.newLineAtOffset(cursorX, cursorY + 3);
                        contentStream.showText(strList.get(i));
                        contentStream.endText();
                        //Move down one line
                        cursorY = cursorY - 20;
                    }
                }
                contentStream.close();
            }
            document.save(output);
        } catch (Exception e) {
            log.error("pdf process error: {}", e.getMessage(), e);
            throw new GlobalException(ResultCode.FAIL, com.yonyou.iuap.ucf.common.i18n.MessageUtils.getMessage("P_YS_PF_ECON-SERVICE_0001163006") /* "Replace pdf content exception" */);
        } finally {
            if (contentStream != null) {
                contentStream.close();
            }
            if (document != null) {
                document.close();
            }
        }
    }

import java.util.ArrayList;
import java.util.List;

public class MyStringSpitUtil {

    /**
     * Split the original string into chunks of at most the specified length
     * @param inputString original string
     * @param length maximum length of each chunk
     * @return the chunks in their original order
     */
    public static List<String> getStrList(String inputString, int length) {
        int size = inputString.length() / length;
        if (inputString.length() % length != 0) {
            size += 1;
        }
        return getStrList(inputString, length, size);
    }

    /**
     * Split the original string into a list of strings of the specified length
     * @param inputString original string
     * @param length specifies the length
     * @param size specifies the list size
     * @return the chunks in their original order
     */
    public static List<String> getStrList(String inputString, int length, int size) {
        List<String> list = new ArrayList<>();
        for (int index = 0; index < size; index++) {
            String childStr = substring(inputString, index * length, (index + 1) * length);
            list.add(childStr);
        }
        return list;
    }

    /**
     * Cut a part out of the string; if the start position is greater than the length of the string, return null
     * @param str original string
     * @param f start position
     * @param t end position
     * @return the substring, or null if the start position is out of range
     */
    public static String substring(String str, int f, int t) {
        if (f > str.length()) {
            return null;
        }
        if (t > str.length()) {
            return str.substring(f);
        }
        return str.substring(f, t);
    }
}
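The wrapping code above moves each following line down by a fixed 20 units. A small sketch of the same step with the line height as a parameter, plus a check that the lines still fit above a lower bound, might look like this. LineBaselines and its methods are hypothetical names, and whether a region's y denotes its top or bottom edge is not stated in the article, so the bottom edge is passed in explicitly.

```java
/** Hypothetical helper: y coordinate of each wrapped line, first line on top. */
final class LineBaselines {

    /** Returns the y coordinate of every line, starting at firstLineY and moving down. */
    static float[] baselines(float firstLineY, float lineHeight, int lineCount) {
        float[] ys = new float[lineCount];
        for (int i = 0; i < lineCount; i++) {
            // pdf y grows upward, so every following line sits lower by one line height
            ys[i] = firstLineY - i * lineHeight;
        }
        return ys;
    }

    /** True if the last line still lies at or above the given bottom edge. */
    static boolean fits(float firstLineY, float lineHeight, int lineCount, float bottomY) {
        return lineCount == 0 || firstLineY - (lineCount - 1) * lineHeight >= bottomY;
    }

    public static void main(String[] args) {
        for (float y : baselines(700f, 20f, 3)) {
            System.out.println(y);
        }
        System.out.println(fits(700f, 20f, 3, 660f));
    }
}
```

A check like fits could be used to shrink the font or cut the text when the wrapped lines would run out of the area the front end reserved.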

I wrote this in a hurry, so small bugs are inevitable. Please point them out and I will fix them as soon as possible. I hope this article solves the problems you run into with pdfbox and saves you some detours; that would be enough for me.

If you found this article helpful, please like it and follow me. If you have anything to add, please leave a comment. I will do my best to write more and better articles.
