Foreword
Preview and download of PDF invoices were requested during development. The **PDF** files come from two sources, the H5 mobile terminal and the **PC terminal**, and the two are handled slightly differently, as described below.
There are many articles about PDF preview, but few of them mention the problems you may run into or how to choose a solution for a specific scenario. The core of this article is therefore to see which of the current implementations fits best given the actual requirements. Corrections and better solutions are welcome in the comment area.
Basic Requirements:
- Support a full preview of the content of PDF files
- Support page-by-page viewing of multi-page PDF files
- Both PC and mobile need to support download and preview
Product Requirements:
- The PC preview should open within the current page
- The font in the PDF preview should be consistent with the font of the actual file
PDF preview
Putting aside the requirements above, let's first summarize several common ways to implement PDF preview:
- Code-based preview with the help of a library, such as the packages built on **`pdfjs-dist`**[1]
- Preview based directly on each browser's built-in PDF plugin, for example through the `<embed>` or `<iframe>` tag
- The server converts the PDF file into an image
Next, let's look at how each of these solutions is implemented and whether it meets the requirements above!
Preview with the browser's built-in tags
`<embed>` tag
The `<embed>` element embeds external content at a specified location in the document. The content is provided by an external application or another source of interactive content, such as a browser plug-in.
To put it simply, a resource shown through `<embed>` is rendered entirely by the capabilities of its environment: if the current environment supports displaying the resource, it is displayed normally; if not, it cannot be displayed.
It is also very simple to use:

    <embed type="application/pdf" :src="pdfUrl" width="800" height="600" />

Most modern browsers have deprecated and removed support for browser plug-ins, so relying on the `<embed>` tag is no longer recommended; other tags such as `<iframe>` can be used instead.
`<iframe>` tag
The `<iframe>` approach is similar to the one above and the overall effect is the same, so it is not shown again here:

    <iframe :src="pdfUrl" width="800" height="600" />

It is worth noting that even when `<iframe>` is used, expanding its inner structure shows that the PDF is still rendered by an `<embed>` tag inside it. What is going on here? Wasn't `<embed>` supposed to be avoided?
First, check the compatibility of `<embed>` on **`caniuse`**[2].
Then pick a browser that does not support `<embed>`, such as IE, and try it there.
Next, try replacing it with `<iframe>`.
Obviously, `<embed>` cannot be displayed at all in an incompatible environment, while `<iframe>` is recognized normally but the resource it loads cannot be handled by IE. In other words, the root cause is that IE does not support previewing files such as PDF: entering http://127.0.0.1:3000/src/assets/2.pdf directly in the address bar does not give an inline preview either.
Therefore, when the browser does not support inline PDF, it should normally be given a fallback link to the PDF so that the file can be downloaded instead. This is what **pdfobject**[3] does. Its source code is relatively simple: PDFObject detects the browser's support for inline/embedded PDF. If embedding is supported, the PDF is embedded; if not, the PDF is not embedded and a fallback link to the PDF is provided, as happens in IE.
In fact, this only saves us from writing some compatibility code, and it does not necessarily fit most people's scenarios. It is mentioned here only because it is related to the tags discussed above.
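The detect-then-fall-back idea can be sketched as follows. This is not PDFObject's actual source; `supportsInlinePdf` and `buildPdfMarkup` are hypothetical helpers, and the navigator-like object is passed in as a parameter so the logic can be tried outside a browser. The sketch assumes `pdfUrl` is a trusted value.

```javascript
// Hypothetical helper: decide whether the environment can show PDFs inline.
// `nav` is a navigator-like object (in the browser, pass window.navigator).
function supportsInlinePdf(nav) {
  // Newer browsers expose a boolean flag for the built-in PDF viewer.
  if (typeof nav.pdfViewerEnabled === "boolean") {
    return nav.pdfViewerEnabled;
  }
  // Older browsers: fall back to the legacy MIME type list.
  const mimeTypes = nav.mimeTypes;
  return Boolean(mimeTypes && mimeTypes["application/pdf"]);
}

// Hypothetical helper: embed the PDF when possible, otherwise offer a link.
// Assumes pdfUrl is trusted (it is not escaped here).
function buildPdfMarkup(nav, pdfUrl) {
  return supportsInlinePdf(nav)
    ? `<iframe src="${pdfUrl}" width="800" height="600"></iframe>`
    : `<a href="${pdfUrl}">Download PDF</a>`;
}

// Simulated environments
const modernBrowser = { pdfViewerEnabled: true };
const legacyBrowser = { mimeTypes: {} };
console.log(buildPdfMarkup(modernBrowser, "/assets/2.pdf"));
console.log(buildPdfMarkup(legacyBrowser, "/assets/2.pdf"));
```

Passing the navigator in as an argument is only a design choice for this sketch; in a page you would call `buildPdfMarkup(window.navigator, pdfUrl)`.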
Preview with vue3-pdfjs
Why not just use pdfjs-dist?
**pdf.js**[4] has a few obvious points to complain about:
- The package name is inconsistent: the package on npm is called `pdfjs-dist`, but the Readme calls it `pdf.js`
- There is no clear documentation to serve as a guide; only the contents of the `examples` directory in its repository can be used as a reference
- The official examples are not friendly enough; for example, no examples for vue/react are provided
- Using it directly requires pulling in a lot of things that are not described in the documentation
- The displayed PDF content is sometimes blurry or has missing parts
- …
Therefore, since packages that wrap it for vue/react already exist, one of them is used directly for the demonstration here.
Specific use
For installation and usage, please refer to **`vue3-pdfjs`**[5]. The Vue3 sample code is as follows:

    <script setup lang="ts">
    import { onMounted, ref } from 'vue'
    import { VuePdf, createLoadingTask } from 'vue3-pdfjs/esm'
    // Prop type definitions can also be imported
    import type { VuePdfPropsType } from 'vue3-pdfjs/components/vue-pdf/vue-pdf-props'
    import type { PDFDocumentProxy } from 'pdfjs-dist/types/src/display/api'
    import pdfUrl from './assets/You-Dont-Know-JS.pdf'

    const pdfSrc = ref<VuePdfPropsType['src']>(pdfUrl)
    const numOfPages = ref(0)

    onMounted(() => {
      // Load the document once to find out how many pages it has
      const loadingTask = createLoadingTask(pdfSrc.value)
      loadingTask.promise.then((pdf: PDFDocumentProxy) => {
        numOfPages.value = pdf.numPages
      })
    })
    </script>

    <template>
      <!-- Render one VuePdf component per page -->
      <VuePdf v-for="page in numOfPages" :key="page" :src="pdfSrc" :page="page" />
    </template>

    <style>
    @import '@/assets/base.css';
    </style>

The effect is as follows:
Problems
Loading an ordinary PDF document seems to work without major issues, so let's try loading a PDF invoice. Because a real invoice contains a lot of sensitive information, the original invoice is not posted here; only the observed preview results are described:
- A lot of the invoice content is missing. For some invoices most of the content is displayed, but the invoice title, the stamp and so on may still not render correctly.
  【Note】The content cannot be displayed completely because `pdf.js` needs certain font libraries. If a font used in the original PDF file has no matching font library, the text is not displayed in `pdf.js`. The font libraries are stored in the `cmaps` folder.
- In addition, the previewed font differs from the actual font, and invoices are special in that font consistency matters more: if the same invoice shows different fonts, it looks less standardized and less legitimate (~~the argument given when font consistency was required~~)
Common solutions: **Solve the problem that pdf.js cannot fully display the content of the pdf file**[6]. In practice you still have to analyze the error messages from the execution environment, and the source code has to be forcibly modified.
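Before patching any source code, it may be worth checking whether the `cmaps` folder is reachable at all. The sketch below builds the options that pdf.js `getDocument` accepts for this purpose (`cMapUrl` and `cMapPacked`). The helper name `buildPdfLoadOptions` and the `/static/cmaps` location are assumptions for illustration, and whether this alone restores the missing invoice text depends on the fonts in the file.

```javascript
// Hypothetical helper: build the options object for pdf.js getDocument().
// cMapBaseUrl is assumed to be where the "cmaps" folder from pdfjs-dist
// has been copied so that the browser can fetch it.
function buildPdfLoadOptions(pdfUrl, cMapBaseUrl) {
  // pdf.js appends the cMap file name to cMapUrl, so it must end with "/".
  const cMapUrl = cMapBaseUrl.endsWith("/") ? cMapBaseUrl : cMapBaseUrl + "/";
  return {
    url: pdfUrl,
    cMapUrl,
    cMapPacked: true, // the cMaps shipped with pdfjs-dist are in packed form
  };
}

const options = buildPdfLoadOptions("/assets/2.pdf", "/static/cmaps");
console.log(options);
// In the browser, with pdfjs-dist loaded as pdfjsLib (not executed here):
//   pdfjsLib.getDocument(options).promise.then((pdf) => { /* render pages */ });
```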
Mozilla Firefox
The built-in PDF reader of Mozilla Firefox is actually pdf.js, so a PDF file can be previewed directly in the Firefox browser.
Most libraries that wrap pdf.js, such as vue-pdf and vue3-pdfjs, usually cannot display the complete content when previewing a PDF invoice without some changes to the source code, whereas the pdf.js built into Firefox displays the content of the same PDF file completely.
Converting the PDF to an image for preview
This approach needs little explanation. The core idea is that when the server responds with a PDF file, it first converts it into an image and returns that, so the front end can display the image directly.
Concrete implementation
The following simulates the server with node:

    const pdf = require('pdf-poppler')
    const path = require('path')
    const Koa = require('koa')
    const koaStatic = require('koa-static')
    const cors = require('koa-cors')

    const app = new Koa()

    // cross domain
    app.use(cors())

    // static resource
    app.use(koaStatic('./server'))

    // get the file name without directory and extension
    function getFileName(filePath) {
      return filePath
        .split('/')
        .pop()
        .replace(/\.[^/.]+$/, '')
    }

    function pdf2png(filePath) {
      // get the file name
      const fileName = getFileName(filePath)
      const dir = path.dirname(filePath)

      // configuration parameters
      const options = {
        format: 'png',
        out_dir: dir,
        out_prefix: fileName,
        page: null,
      }

      // convert pdf to png and return the URL of the first page image
      return pdf
        .convert(filePath, options)
        .then((res) => {
          console.log('Successfully converted!')
          return `http://127.0.0.1:4000${dir.replace('./server', '')}/${fileName}-1.png`
        })
        .catch((error) => {
          console.error(error)
        })
    }

    // response
    app.use(async (ctx) => {
      if (ctx.path.endsWith('/getPdf')) {
        const url = await pdf2png('./server/pdf/2.pdf')
        ctx.body = { url }
      } else {
        ctx.body = 'hello world!'
      }
    })

    app.listen(4000)

Avoid stepping on some pitfalls
Pit 1: pdf-image is not recommended
To convert PDF files into images, the server has to rely on third-party packages. At first the **`pdf-image`**[7] package was used, but many errors occurred during the actual conversion. Tracing the errors through the source code showed that it depends on additional tools, because it runs commands such as `pdfinfo xxx`. Similar problems are reported in its `issue`[8] list, but none of the suggestions there worked.
Therefore `pdf-poppler`[9] is the better recommendation: it ships with a `pdftocairo` program that can convert PDF files to images. However, its current version only supports Windows and Mac OS.
Pit 2: path.basename not a function
The code above needs the name of the file. This can be done simply with `path.basename(path[, suffix])` from the Node API.
However, the following exception occurred when the program ran. The corresponding code is:

    // configuration parameters
    const options = {
      format: 'png',
      out_dir: dir,
      out_prefix: path.baseName(filePath, path.extname(filePath)), // an exception occurred
      page: null,
    }

At first the cause was not found, so a simple `getFileName` method was implemented to get the file name instead.
Reason for the error: relying too much on the editor's autocompletion, `basename` was entered as `baseName`. Yes, the only difference is n versus N.
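For comparison, a minimal sketch of the same helper written with the correctly spelled `path.basename`. The function name `getFileName` mirrors the one used earlier; the last line shows why the misspelled call fails.

```javascript
const path = require("path");

// Strip the directory and the extension from a file path.
function getFileName(filePath) {
  return path.basename(filePath, path.extname(filePath));
}

console.log(getFileName("./server/pdf/2.pdf")); // prints "2"

// The misspelled property does not exist on the path module, which is why
// calling it throws "path.baseName is not a function".
console.log(typeof path.baseName); // prints "undefined"
```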
Pit 3: Details
The code above starts the simulated business service with koa. Because the business service (http://127.0.0.1:4000) and the application service (http://127.0.0.1:3000) use different ports, requests between them are cross-origin. This can be solved with `koa-cors`; note that `koa-cors` may sometimes appear not to take effect right after the business server is restarted.
Since the response is returned directly in a general koa middleware, static resource access has to be added separately, for which `koa-static` can be used. Note that when the static directory is specified through `koa-static`, for example **`app.use(koaStatic('./static'))`**, requesting http://127.0.0.1:4000/static/pdf/xxx.png returns a 404 Not Found error. The reason is that `koa-static` treats /static/ as the **root path**, so the correct URL is http://127.0.0.1:4000/pdf/xxx.png.
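The path mapping described above can be sketched as a small helper that turns a file on disk into the URL under which a static root would serve it. `toPublicUrl` is a hypothetical name used only for illustration; it is plain string handling and does not call `koa-static` itself.

```javascript
// Hypothetical helper: map a file inside the static root to its public URL.
// The static root itself never appears in the URL.
function toPublicUrl(origin, staticRoot, filePath) {
  // Normalize "./static/" and "static" to the same form.
  const normalize = (p) =>
    p.replace(/\\/g, "/").replace(/^\.\//, "").replace(/\/+$/, "");
  const root = normalize(staticRoot);
  const file = normalize(filePath);
  if (!file.startsWith(root + "/")) {
    throw new Error(`${filePath} is not inside ${staticRoot}`);
  }
  // Drop the root directory and keep the rest of the path.
  return origin + file.slice(root.length);
}

console.log(toPublicUrl("http://127.0.0.1:4000", "./static", "./static/pdf/xxx.png"));
// prints "http://127.0.0.1:4000/pdf/xxx.png"
```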
Effect demonstration
The invoice content is not suitable for display, so it is not shown here. The points to pay attention to are the generated images and their paths.
PDF download
The download discussed here is not limited to PDF files; it covers the download methods the client side supports in general. The most common ones are:
- `<a>` tag, e.g. a link with the `download` attribute
- location.href, e.g. `window.location.href = xxx`
- window.open, e.g. `window.open(xxx)`
- Content-Disposition, e.g. `Content-Disposition: attachment; filename="xxx"`
Download with the `<a>` tag
The `download` attribute of `<a>` instructs the browser to download the URL specified by href instead of navigating to it, usually prompting the user to save it as a local file. If the `download` attribute has a value, that value is used as the pre-filled file name when saving. It is only a pre-filled name, mainly for the following reasons:
- The value may be modified dynamically via JavaScript
- A file name specified in `Content-Disposition` takes precedence over `a.download`
This is probably the method everyone knows best, but there are still some points worth noting:
- The `download` attribute only applies to same-origin URLs
  - A same-origin URL is downloaded
  - A non-same-origin URL is navigated to
  - If a non-same-origin resource still needs to be downloaded, it can be converted to a **`blob: URL`**[10] or **`data: URL`**[11]
- If the **`Content-Disposition`**[12] HTTP response header specifies a different file name, the name in `Content-Disposition` takes precedence
- If **`Content-Disposition`**[13] in the HTTP response header is set to `Content-Disposition: inline`, Firefox gives priority to `Content-Disposition` over the `download` attribute (this behavior has varied between browser versions)
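The same-origin rule can be turned into a small decision helper. This is only a sketch: `pickDownloadStrategy` and the two strategy names it returns are hypothetical, and the page URL is passed in explicitly (in the browser it would be `window.location.href`).

```javascript
// Hypothetical helper: decide how a file URL should be downloaded from a page.
// Returns "anchor-download" when <a download> is expected to work directly,
// or "fetch-as-blob" when the resource should first be turned into a blob: URL.
function pickDownloadStrategy(fileUrl, pageUrl) {
  // Resolve relative URLs against the page, as the browser would.
  const target = new URL(fileUrl, pageUrl);
  // blob: and data: URLs can be used with the download attribute as they are.
  if (target.protocol === "blob:" || target.protocol === "data:") {
    return "anchor-download";
  }
  const page = new URL(pageUrl);
  return target.origin === page.origin ? "anchor-download" : "fetch-as-blob";
}

const page = "http://127.0.0.1:3000/index.html";
console.log(pickDownloadStrategy("/assets/2.pdf", page)); // prints "anchor-download"
console.log(pickDownloadStrategy("http://127.0.0.1:4000/pdf/2-1.png", page)); // prints "fetch-as-blob"
```

Note that a different port already makes the origin different, which is the situation of the two local services used in this article.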
Static mode:

    <a href="http://127.0.0.1:4000/pdf/2-1.png" download="2.pdf">Download</a>

Dynamic method:

    function download(url, filename) {
      const a = document.createElement("a"); // create the a tag
      a.href = url; // download path
      a.download = filename; // download attribute, file name
      a.style.display = "none"; // not visible
      document.body.appendChild(a); // mount
      a.click(); // trigger the click event
      document.body.removeChild(a); // remove
    }

Blob method

    // reqConf, config and FileType come from the surrounding request code
    if (reqConf.responseType == 'blob') {
      // read the file name from the response headers
      let contentDisposition = config.headers['content-disposition'];
      if (!contentDisposition) {
        contentDisposition = `;filename=${decodeURI(config.headers.filename)}`;
      }
      const fileName = window.decodeURI(contentDisposition.split(`filename=`)[1]);
      // file type, taken from the last extension so names with several dots work
      const suffix = fileName.split('.').pop();
      // create the blob object
      const blob = new Blob([config.data], {
        type: FileType[suffix],
      });
      const link = document.createElement('a');
      link.style.display = 'none';
      link.href = URL.createObjectURL(blob); // create the url object
      link.download = fileName; // file name after download
      document.body.appendChild(link);
      link.click();
      document.body.removeChild(link); // remove the hidden a tag
      URL.revokeObjectURL(link.href); // destroy the url object
    }

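Splitting the header on `filename=` works for simple values, but quoted names and the encoded `filename*=` form need a little more care. The following is a hedged sketch of a more tolerant parser; `parseFileName` and `getExtension` are hypothetical helpers, not part of any library used in this article.

```javascript
// Hypothetical helper: extract the file name from a Content-Disposition value.
function parseFileName(contentDisposition, fallback = "download") {
  if (!contentDisposition) return fallback;
  // Prefer the encoded form, e.g. filename*=UTF-8''%E5%8F%91%E7%A5%A8.pdf
  const extended = /filename\*\s*=\s*([^']*)'[^']*'([^;]+)/i.exec(contentDisposition);
  if (extended) {
    try {
      return decodeURIComponent(extended[2].trim());
    } catch {
      // malformed percent-encoding: fall through to the plain form
    }
  }
  // Plain form, quoted or unquoted: filename="a.pdf" or filename=a.pdf
  const plain = /filename\s*=\s*("([^"]*)"|[^;]+)/i.exec(contentDisposition);
  if (!plain) return fallback;
  return (plain[2] !== undefined ? plain[2] : plain[1]).trim();
}

// Hypothetical helper: take the extension after the last dot.
function getExtension(fileName) {
  const dot = fileName.lastIndexOf(".");
  return dot > 0 ? fileName.slice(dot + 1).toLowerCase() : "";
}

const name = parseFileName('attachment; filename="invoice.2023.pdf"');
console.log(name, getExtension(name)); // prints "invoice.2023.pdf pdf"
```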
Download with Content-Disposition and location.href / window.open
These look like three download methods, but they are really one, and it is based on `Content-Disposition`.
The `Content-Disposition` response header indicates how the response content should be presented: inline (as a web page or part of a page), or as an attachment that is downloaded and saved locally:
- `inline`: the default value, meaning the response body is displayed as part of the page or as the entire page, e.g. `Content-Disposition: inline`
- `attachment`: the response body should be downloaded locally; most browsers present a "Save As" dialog pre-filled with the value of `filename`, e.g. `Content-Disposition: attachment; filename="filename.jpg"`
Therefore, `location.href = 'xxx'` and `window.open(xxx)` can trigger a download only because the response carries a header of the form `Content-Disposition: attachment; filename="filename.jpg"`, or because the resource triggers the browser's own download behavior. When that condition is met, the file is downloaded whether the URL is reached through an `<a>` tag jump, `location.href` navigation, a new page opened by `window.open`, or by entering the URL directly in the address bar.
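On the server side, the header value has to be built so that non-ASCII file names (common for invoices) survive. The sketch below assumes the usual combination of an ASCII `filename` fallback plus an encoded `filename*`; `buildAttachmentHeader` is a hypothetical helper name.

```javascript
// Hypothetical helper: build a Content-Disposition value for a download.
function buildAttachmentHeader(fileName) {
  // ASCII-only fallback for clients that ignore filename*:
  // replace non-ASCII characters, quotes and backslashes with "_".
  const asciiName = fileName
    .replace(/[^\x20-\x7e]/g, "_")
    .replace(/["\\]/g, "_");
  // Percent-encode the real name; also encode the characters that
  // encodeURIComponent leaves alone but the extended syntax does not allow.
  const encoded = encodeURIComponent(fileName).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
  return `attachment; filename="${asciiName}"; filename*=UTF-8''${encoded}`;
}

console.log(buildAttachmentHeader("2.pdf"));
// prints: attachment; filename="2.pdf"; filename*=UTF-8''2.pdf

// In a Koa middleware this could be applied as (not executed here):
//   ctx.set('Content-Disposition', buildAttachmentHeader('2.pdf'))
```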
H5 mobile terminal download
The H5 mobile terminal can implement preview with the methods above, but download is different, because two scenarios have to be distinguished:
- Mobile browser
- WeChat built-in browser
Downloading in a mobile browser works basically the same way as described above; as long as the client supports downloading, there is no problem. In the **WeChat built-in browser**, however, the normal download methods may not behave as expected:
- On Android, using the normal download method pops up a dialog asking whether to open the mobile browser to download the resource, although some models do not show it
- On iOS, none of the methods above can download. Usually a new webview is opened to provide a preview; some models support long-pressing the screen in the new page to save, but not all do
The root cause is that the WeChat built-in browser blocks download links of any kind, such as app download links and ordinary file download links.
How else can the H5 mobile terminal download?
Since the WeChat built-in browser environment blocks the download function, there is no point in trying to implement downloads inside it (~~I dare not even think about it~~). What should be considered instead is how to download indirectly:
- Detect whether the page is running in the WeChat built-in browser. If so, help the user open the mobile browser for the download automatically. Because not all models support this **activation**, it is best to also prompt the user to download directly in the mobile browser; for convenience, a one-tap copy of the link can be provided
- Alternatively, state directly that only PC downloads are supported and give up downloading on mobile
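The detection step can be sketched with a user-agent check. The WeChat built-in browser includes the `MicroMessenger` token in its user agent; the helper names, the returned action names and the sample user-agent strings below are made up for illustration.

```javascript
// Detect the WeChat built-in browser from a user-agent string.
function isWeChatBrowser(userAgent) {
  return /MicroMessenger/i.test(userAgent);
}

// Hypothetical helper: choose what the download button should do.
function pickMobileDownloadAction(userAgent) {
  if (!isWeChatBrowser(userAgent)) {
    return "direct-download"; // a normal browser can download as usual
  }
  // Inside WeChat: show a prompt and offer a one-tap copy of the link,
  // so the user can open it in the system browser.
  return "prompt-copy-link";
}

// Sample user-agent strings (illustrative only)
const wechatUa =
  "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15 Mobile/15E148 MicroMessenger/8.0.30";
const safariUa =
  "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15 Version/16.0 Mobile/15E148 Safari/604.1";

console.log(pickMobileDownloadAction(wechatUa)); // prints "prompt-copy-link"
console.log(pickMobileDownloadAction(safariUa)); // prints "direct-download"
```

In the page itself the argument would be `navigator.userAgent`.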
Last
In summary, there may be no way to achieve a perfect PDF preview, especially for PDF invoice files. The following problems remain:
- Download cannot be guaranteed on H5 mobile devices
- There is no guarantee that the font in the PDF preview matches the font of the actual invoice
Most of the existing preview approaches are built on pdf.js. Internally, pdf.js uses the `PDFJs.getDocument(url/buffer)` method to obtain the content from a file address or a data stream, and then renders the PDF file through canvas. If you are interested, you can study the source code of pdf.js.
pdf.js brings a related problem: if the PDF file contains fonts that are not available to pdf.js, it cannot be rendered completely, and the rendered font may differ from the font of the original PDF file.
On these two points, Google's built-in PDF plug-in seems to provide good support. This means that in browsers that include the Google-related plug-in (such as Edge and QQ Browser), the preview can be implemented directly with the browser's built-in viewer; for stricter font consistency, the only option is to view the source file by downloading it.
What if the product requirements cannot be met?
As discussed, the solutions above cannot actually meet some of the requirements listed at the beginning of the article. The purpose of product requirements is to provide a better user experience (~~under normal circumstances~~), but they still have to be implemented technically, so the limits of what the technology supports need to be fed back in time (~~unless your product manager has technical experience~~). As a developer, you need to provide enough evidence to convince the product side, and then propose some indirect solutions yourself (or let the product side come up with a new plan) to see whether they meet the revised expectations. The core is reasonable communication plus alternative plans (everyone's situation is different, and the actual situation may be... you know).
The above are some personal opinions and understandings. If anything is inappropriate, please correct it in the comment area!
Hope this article is helpful to you!
About this article
Author: Xiong Mao
https://juejin.cn/post/7207078219215732794