Importing .docx and .doc Word documents into the TinyMCE editor (full version)

Something to note before reading this article

On the front end, I import a Word document, automatically convert it to HTML, and insert the result into the TinyMCE editor. I use mammoth.js to read the Word content and set it in the editor. Mammoth can only parse Word files in .docx format; it does not currently support .doc, although a later release may add it.

Why can’t I parse .doc?

A .docx Word document is a ZIP archive of XML files. Its structure is well defined and openly documented, so open-source JavaScript libraries can parse and convert it.

A .doc Word document uses the older binary format. Its structure is much more complex and was long proprietary, so it is hard to read reliably without Microsoft Office or a dedicated server-side library.
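
Because the two formats use different containers, they can also be told apart by their leading bytes rather than by file extension alone. The following is a minimal sketch, not part of the original code: `sniffWordFormat` is a hypothetical helper name. Note that the ZIP signature only proves the file is a ZIP archive (other Office formats such as .xlsx share it), and the OLE signature is shared by legacy .xls and .ppt files, so this check complements the extension check rather than replacing it.

```javascript
// Leading bytes ("magic numbers") of the two container formats.
var ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04]; // "PK\x03\x04": ZIP, used by .docx
var OLE_SIGNATURE = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1]; // OLE compound file, used by legacy .doc

// Returns true when the byte array begins with the given signature.
function hasSignature(bytes, signature) {
    if (bytes.length < signature.length) {
        return false;
    }
    return signature.every(function (value, index) {
        return bytes[index] === value;
    });
}

// Hypothetical helper: returns 'docx', 'doc' or 'unknown' for an ArrayBuffer.
function sniffWordFormat(arrayBuffer) {
    var bytes = new Uint8Array(arrayBuffer);
    if (hasSignature(bytes, ZIP_SIGNATURE)) {
        return 'docx';
    }
    if (hasSignature(bytes, OLE_SIGNATURE)) {
        return 'doc';
    }
    return 'unknown';
}
```

In the upload handler, such a check could run on the ArrayBuffer before choosing between mammoth and the backend, which guards against a .doc file that was merely renamed to .docx.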

Code idea: a .docx document is parsed with mammoth and set in the editor. A .doc document is processed by Java on the backend (I covered Java processing of .doc in an earlier article), and the returned HTML is set in the editor.

For details on using TinyMCE, see the Chinese TinyMCE manual. I won't go into them here; it is fairly simple and you can pick it up on your own.

The code depends on two third-party JavaScript libraries: tinymce.min.js (the editor library) and mammoth.browser.js (a single standalone file).

You don't need to hunt for them online; both are available for free in the download resources of my personal center. Note that TinyMCE has many plug-ins, and my copy of the library may not include all of them, but it is more than enough for everyday Word documents. As for images, I simply copy and paste them into the editor and do not upload them. Image upload has to be implemented together with the backend, and if an uploaded image is later removed from the page, the backend must delete it as well so that orphaned image data does not pile up.

Clicking the Upload Word button opens a file chooser; just select the Word document to import.

Roughly 90% of the formatting is restored, though some incompatibilities remain. Images are not resized automatically, but you can resize them yourself in the editor.
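
To see which parts of a document were not restored, it can help to inspect the `messages` array that mammoth returns alongside the HTML in `value`. The sketch below is an illustration only: `summarizeMessages` is a hypothetical helper, and it assumes each message is an object with `type` and `message` fields, as described in the mammoth documentation.

```javascript
// Hypothetical helper: groups mammoth's conversion messages by type so that
// lost formatting is reported instead of being silently dropped.
function summarizeMessages(messages) {
    var summary = {};
    (messages || []).forEach(function (item) {
        if (!summary[item.type]) {
            summary[item.type] = [];
        }
        summary[item.type].push(item.message);
    });
    return summary;
}

// Usage in the browser, assuming mammoth.browser.js is loaded:
// mammoth.convertToHtml({ arrayBuffer: arrayBuffer }).then(function (result) {
//     console.warn(summarizeMessages(result.messages));
// });
```

Logging the summary after each import gives a quick picture of which Word styles mammoth did not recognize.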

tinymce.init({
                selector: '#conTextarea', // target container: a textarea or a div both work
                branding: false,
                promotion: false,
                statusbar: false, // disable status bar
                height: 900,
                language: 'zh-Hans',
                plugins: "image,table,advlist,fullscreen,link,lists,preview,searchreplace,insertdatetime,charmap",//image imagetools
                toolbar: ['fontselect | formatselect | fontsizeselect | forecolor backcolor | bold italic underline strikethrough | image | table | alignleft aligncenter alignright alignjustify | outdent indent | numlist bullist | preview hr | undo redo | fullscreen searchreplace |print | customUploadBtn'],
                file_picker_callback: function(callback, value, meta) {
                    // Open a file chooser for picking an image
                    var input = document.createElement('input');
                    input.type = 'file';
                    input.accept = 'image/*';

                    input.onchange = function () {
                        var file = input.files[0];
                        // Convert the file to a base64 data URL
                        var reader = new FileReader();
                        reader.onloadend = function () {
                            var base64 = reader.result;
                            // Pass the base64 data URL back to the editor's image dialog
                            callback(base64, {
                                alt: ''
                            });
                        };
                        reader.readAsDataURL(file);
                    };
                    input.click();
                },
                setup: function (editor) {
                    // Register the custom toolbar button
                    editor.ui.registry.addButton('customUploadBtn', {
                        text: 'Upload Word',
                        onAction: function () {
                            var input = document.createElement('input');
                            input.type = 'file';
                            input.accept = '.doc,.docx';
                            //Perform upload file operation
                            input.addEventListener("change", handleFileSelect, false);

                            // Encode an ArrayBuffer as a base64 string
                            function arrayBufferToBase64(arrayBuffer) {
                                var binary = '';
                                var bytes = new Uint8Array(arrayBuffer);
                                var len = bytes.byteLength;
                                for (var i = 0; i < len; i++) {
                                    binary += String.fromCharCode(bytes[i]);
                                }
                                return window.btoa(binary);
                            }

                            function handleFileSelect(event) {
                                var file = event.target.files[0];
                                if (!file) {
                                    return; // no file was selected
                                }
                                // Take the file extension. A .docx file is parsed in the browser with mammoth.
                                // A .doc file is sent to the backend as base64, converted to HTML there
                                // with Java, and the HTML is returned to the front end.
                                var extension = file.name.slice((file.name.lastIndexOf(".") - 1 >>> 0) + 2).toLowerCase();
                                if (extension === 'docx') {
                                    readFileInputEventAsArrayBuffer(event, function (arrayBuffer) {
                                        mammoth.convertToHtml({ arrayBuffer: arrayBuffer })
                                            .then(displayResult, function (error) {
                                                console.error(error);
                                            });
                                    });
                                } else if (extension === 'doc') {
                                    readFileInputEventAsArrayBuffer(event, function (arrayBuffer) {
                                        // base64 payload to send to the backend
                                        var base64Data = arrayBufferToBase64(arrayBuffer);
                                        console.log(base64Data);
                                        // Placeholder: replace with the HTML returned by the backend request
                                        var result = "Background request";
                                        // setContent must run inside this callback, where result is in scope
                                        tinymce.activeEditor.setContent(result);
                                    });
                                }
                            }

                            function displayResult(result) {
                                // result.value holds the HTML produced by mammoth; setContent replaces the editor content
                                tinymce.activeEditor.setContent(result.value);
                            }

                            function readFileInputEventAsArrayBuffer(event, callback) {
                                var file = event.target.files[0];
                                var reader = new FileReader();
                                reader.onload = function (loadEvent) {
                                    var arrayBuffer = loadEvent.target.result;
                                    callback(arrayBuffer);
                                };
                                reader.readAsArrayBuffer(file);
                            }

                            // Trigger the click event and open the file selection dialog box
                            input.click();
                        }
                    });
                }
            })
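
The .doc branch above only contains a "Background request" placeholder. As a rough sketch of what that request might look like, the function below posts the base64 data to the server and resolves with the returned HTML. Everything about the server side is an assumption: the URL `/api/word/doc-to-html`, the JSON request fields `fileName` and `content`, and the `{ "html": ... }` response shape are hypothetical and must be adapted to your own Java backend.

```javascript
// Hypothetical endpoint and payload shape; adjust both to match your backend.
var DOC_CONVERT_URL = '/api/word/doc-to-html';

// Sends the base64-encoded .doc file to the server and resolves with the HTML
// it returns. fetchImpl is injectable so the function can be exercised without
// a network; when omitted, the global fetch is used.
function convertDocOnServer(base64Data, fileName, fetchImpl) {
    var doFetch = fetchImpl || fetch;
    return doFetch(DOC_CONVERT_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ fileName: fileName, content: base64Data })
    }).then(function (response) {
        if (!response.ok) {
            throw new Error('Conversion failed with HTTP status ' + response.status);
        }
        return response.json();
    }).then(function (payload) {
        return payload.html; // assumed response shape: { "html": "<p>...</p>" }
    });
}

// Usage inside the .doc branch:
// convertDocOnServer(base64Data, file.name).then(function (html) {
//     tinymce.activeEditor.setContent(html);
// }, function (error) {
//     console.error(error);
// });
```

Because the request is asynchronous, the call to `setContent` belongs in the promise callback, after the server has answered.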

TinyMCE stores its content as HTML. What you do with it is up to you: send it to the backend to generate a TXT file, write it out directly as an HTML file, or export it to PDF.
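
As one possible way to produce a standalone HTML file, the editor's HTML fragment can be wrapped in a complete document before it is saved or downloaded. This is only a sketch: `wrapHtmlDocument` and `escapeHtml` are hypothetical helpers, and the commented download code assumes a browser environment.

```javascript
// Escapes text for safe use inside the <title> element.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical helper: wraps the editor's HTML fragment in a complete document.
function wrapHtmlDocument(bodyHtml, title) {
    return [
        '<!DOCTYPE html>',
        '<html>',
        '<head>',
        '<meta charset="utf-8">',
        '<title>' + escapeHtml(title) + '</title>',
        '</head>',
        '<body>',
        bodyHtml,
        '</body>',
        '</html>'
    ].join('\n');
}

// Browser usage: read the content from the editor and offer it as a download.
// var html = wrapHtmlDocument(tinymce.activeEditor.getContent(), 'Imported document');
// var link = document.createElement('a');
// link.href = URL.createObjectURL(new Blob([html], { type: 'text/html' }));
// link.download = 'document.html';
// link.click();
```

Declaring the charset in the wrapper keeps non-ASCII text readable when the file is opened directly from disk.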

Let’s talk about the advantages and disadvantages of tinymce

Advantages:

1. Easy to use: TinyMCE's user interface resembles a traditional word processor, so it is easy to pick up. Users can type text and insert images and other media directly in the editing area.

2. Highly customizable: TinyMCE provides a large number of options, allowing developers to set up and configure the editor according to their own needs. Buttons, toolbars and plug-ins can be added or removed to meet specific editing requirements.

3. Multi-language support: TinyMCE can be configured to use different languages, so users can work in a language they are familiar with. This has helped it gain use around the world.

4. Easy to integrate: TinyMCE integrates with many other platforms and applications. For example, the WordPress classic editor is built on TinyMCE, and Joomla ships it as its default editor.

Disadvantages:

1. Not suitable for every project: Although TinyMCE's customizability meets the needs of most projects, it does not fit every project and scenario. For example, if your application only needs light editing of content, TinyMCE may offer too many controls and confuse users.

2. Complexity: Although TinyMCE is easy for end users, it is a large and complex project that takes time and effort to learn and configure.

3. Older technology: Although TinyMCE is under active development, some aspects of it rely on older technology. For example, the editor core is not itself built on a modern framework such as React or Angular, although wrapper components for such frameworks are available.

In short, TinyMCE is a popular web rich text editor that offers a great deal of flexibility and customization, but it is not the right fit for every application.

Reference article: http://blog.ncmem.com/wordpress/2023/09/17/tinymce editor imports the full version of docx and doc format word documents/