How can the front end detect abnormal data tampering and restore the data?

Every day, we deal with all kinds of documents, such as PRDs, technical design docs, and personal notes.

In fact, there is a lot to know about document formatting. I, for one, am obsessive about formatting, and I can’t stand text that has no spaces between English and Chinese.

Therefore, I have recently been working on a Chrome extension called chrome-extension-text-formatting. With this extension, I can quickly format the selected text so that it conforms to the Chinese Copywriting and Typesetting Guidelines.

Emmm, what are typesetting guidelines? Simply put, their purpose is to unify the usage of Chinese copywriting and typesetting, reduce communication costs between team members, and make the website look more elegant.

For example:

Spaces need to be added between Chinese and English

Correct:

On LeanCloud, data storage is around AVObject.

Incorrect:

On LeanCloud, data storage is carried out around AVObject.

On LeanCloud, data storage is around AVObject.

Complete correct usage:

On LeanCloud, data storage is around AVObject. Each AVObject contains JSON-compatible key-value pairs. The data is schema-free. You don’t need to specify in advance which keys exist on each AVObject. You just need to set the corresponding key-value directly.

Exception: Product nouns such as “Douban FM” are written in the officially defined format.

Spaces need to be added between Chinese characters and numbers

Correct:

I spent 5,000 yuan shopping for groceries today.

Incorrect:

I spent 5,000 yuan shopping for groceries today.

I spent 5,000 yuan shopping for groceries today.

Of course, the full typesetting specification covers much more than this; the above is only a brief sample. Moreover, the specification is just a recommendation, and it is difficult to enforce. That is why I wanted to build a Chrome extension that formats the selected text with one click.

Take a look at the diagram:

It works in all kinds of text editing boxes, and of course it can also be used in Excel:

Of course, this is not the focus of this article.

An abnormal scenario encountered while adding compatibility with Yuque documents

Because document platforms differ from one another, the extension has to be made compatible with each of them during development (mostly the platforms I use myself, such as Google Docs, Yuque, Youdao Cloud, GitHub, etc.).

Overall, the function of the entire extension is very simple. A minimalist process is as follows:

It should be noted that most of the above operations are performed based on JavaScript script files inserted into the page.

While adding compatibility with Yuque documents, I ran into an interesting situation.

After step 4 above is completed, when we perform any operation on the replaced text, such as re-focusing or re-editing, the modified text is replaced and restored to its state before the modification!

What does that mean? Take a look at the actual screenshot below:

To sum up, what does this operation of Yuque mean?

After the script replaces the originally selected text, the modified content is restored as soon as the text is focused again.

After some testing, I clarified the logic of the Yuque document:

  1. If the user inputs content normally, types the content through the keyboard, or copies and pastes normally, the document can be modified normally and saved;

  2. If the document content is modified by script insertion or replacement, or the document content is modified by manually modifying the DOM through the console, the document content will be restored;

  3. After using a script to make any modifications to the content, even if you do not perform any operation and click the save button directly, the document will still be restored to the version before the operation;

Oh, this feature is indeed very interesting. Its power lies in the fact that it can identify whether a content modification is a routine, normal operation or an unconventional one, such as a script or console modification. After an abnormal operation, it falls back to the version left by the last normal operation.

So, how does Yuque do this?

Since the obfuscated, compiled code running online is hard to debug with breakpoints, let’s instead make a bold guess about where we might start if we had to implement a similar feature ourselves.

MutationObserver implements document content stack storage

First, we definitely need to use MutationObserver.

MutationObserver is a JavaScript API for monitoring DOM changes. It provides the ability to asynchronously observe the DOM tree and trigger callback functions when changes occur.

Let’s build a minimal scenario of an online document:

<div id="g-container" contenteditable>
    This is a section of the Web cloud document. If you edit it directly, you can edit it successfully. If modified using the console, the data will be restored.
</div>
#g-container {
    width: 400px;
    padding: 20px;
    line-height: 2;
    border: 2px dashed #999;
}

Here, we use the contenteditable attribute of HTML to implement an editable DIV box:

Next, we can use MutationObserver to monitor this DOM element. Whenever the content of this element changes, the event callback of MutationObserver is triggered, and the result of each element change is recorded through an array.

The approximate code is as follows:

const targetElement = document.getElementById("g-container");
//Record initial data
let cacheInitData = '';

function observeElementChanges(element) {
    const changes = []; // Array to store changes
    const targetElementCache = element.innerText;

    //Cache the initial data each time
    cacheInitData = targetElementCache;
    
    //Create MutationObserver instance
    const observer = new MutationObserver((mutationsList, observer) => {
        mutationsList.forEach((mutation) => {
            const { type, target, addedNodes, removedNodes } = mutation;
            // For text edits, the target is the text node; record its current content
            const realtimeText = type === "characterData" ? target.data : "";
            
            const change = {
                type,
                target,
                addedNodes: [...addedNodes],
                removedNodes: [...removedNodes],
                realtimeText,
            };
            
            changes.push(change);
        });
        
        console.log("changes", changes);
    });

    //Configure MutationObserver
    const config = { childList: true, subtree: true, characterData: true };

    // Start observing changes in elements
    observer.observe(element, config);
}

observeElementChanges(targetElement);

The above code takes a little time to read. But its essence is very easy to understand. I will roughly list its core steps:

  1. Create a MutationObserver instance to observe changes in the specified DOM element

  2. Define a configuration object config that specifies the observation options. In this example, the configuration object sets:

    childList: true means observing changes in child nodes

    subtree: true means observing changes in all descendant nodes

    characterData: true means observing changes in the text content of the node

  3. Store changed information in the changes array

  4. Each element in the changes array records information about a DOM change. Each change object contains the following properties:

    type: represents the type of change, which can be "attributes" (attribute changes), "characterData" (text content changes) or "childList" (child node changes).

    target: Indicates the target element that has changed.

    addedNodes: An array containing new nodes, representing the nodes added in the change.

    removedNodes: An array containing removed nodes, representing the nodes removed in the change.

    realtimeText: Real-time text content, which can be set according to specific needs.

In this way, we try to edit the DOM element, open the console, and see what is output for each change:

It can be found that every time the content in the DIV is updated, a MutationObserver callback is triggered.

Let’s expand two places in the array to explain in detail:

Here, type indicates which kind of change configured in the MutationObserver config was triggered this time. It is characterData, the change in text content mentioned above. Both addedNodes and removedNodes are empty, indicating that there was no structural change.

The only difference between the two sets of data is realtimeText, which we use to record the text content inside the editable DOM element.

  • The first operation deleted a period, so the realtimeText text has one period fewer than the initial text.

  • The second operation deleted a duplicated word, so the realtimeText text has one duplicated word fewer than the initial text.

The subsequent records follow the same pattern. With this information, we have in effect built a complete operation stack for the DOM structure!

On this basis, we can push the untouched initial data into the changes array before observation starts. This means that we can restore the data to any step in the user’s operation history.
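As a minimal sketch of this idea, the initial content can be stored as entry 0 and every later state appended after it, so that any step can be looked up by index. The names createHistory, push, and restoreTo are hypothetical and chosen only for illustration:

```javascript
// Hypothetical helper: a plain-array history of text snapshots.
// Entry 0 is the untouched initial content; later entries are edits.
function createHistory(initialText) {
    const snapshots = [{ step: 0, text: initialText }];

    return {
        // Record the text as it is after one more change
        push(text) {
            snapshots.push({ step: snapshots.length, text });
        },
        // Return the text as it was after the given step (0 = initial)
        restoreTo(step) {
            if (step < 0 || step >= snapshots.length) {
                throw new RangeError(`No snapshot for step ${step}`);
            }
            // Drop everything recorded after that step
            snapshots.length = step + 1;
            return snapshots[step].text;
        },
        get length() {
            return snapshots.length;
        },
    };
}

const textHistory = createHistory("Hello, world.");
textHistory.push("Hello, world");   // the user deleted the period
textHistory.push("Hello world");    // the user deleted the comma
console.log(textHistory.restoreTo(1)); // "Hello, world"
```

Storing full text snapshots is the simplest option; a real editor would more likely store diffs or operations to save memory.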

Using characteristic signals to identify whether the user typed the input manually

With the above changes array, we are equivalent to having stack information for each step of the user’s operation.

The next core is how we should use them.

In the case of Yuque, the core point is:

It can identify whether content modifications are routine normal operations or irregular operations such as scripts and console modifications. And after abnormal operation, fall back to the last normal operation version.

Therefore, the question we explore next becomes how to identify an input edit box and whether its content modification is a normal input modification or an abnormal input modification.

For example, think about what characteristic information should appear when a user inputs normally or copies and pastes content into the edit box:

  1. The currently focused element of the page is available through document.activeElement. Every time a mutation is triggered, we can store a reference to the focused element and check whether it was the current input box when the content changed.

  2. Track the focus state of the input box by listening for its focus and blur events.

  3. When the text content changes, did the user trigger a keyboard event, such as keydown?

  4. When the text content changes, did the user trigger a paste event?

  5. When the content is modified directly through the console, the DOM subtree may change in addition to the text content, which triggers a childList mutation.
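The signals listed above could be combined roughly as follows. This is only a sketch under assumptions: createInputTracker, isLikelyManualEdit, and the 50 ms window are hypothetical choices made for illustration, and nothing here is confirmed to be how Yuque does it.

```javascript
// Hypothetical sketch: remember when the user last typed or pasted, and trust a
// mutation only if it closely follows such an event while the editor has focus.
const TRUST_WINDOW_MS = 50; // assumed tolerance; a real editor would need tuning

function createInputTracker() {
    let lastInputAt = -Infinity;
    return {
        markUserInput(now) { lastInputAt = now; },   // call from keydown / paste
        getLastInputAt() { return lastInputAt; },
    };
}

function isLikelyManualEdit({ editor, activeElement, lastInputAt, now }) {
    const editorHasFocus = activeElement === editor;           // signals 1 and 2
    const recentInput = now - lastInputAt <= TRUST_WINDOW_MS;  // signals 3 and 4
    return editorHasFocus && recentInput;
}

// Browser-only wiring, skipped where no DOM is available
if (typeof document !== "undefined") {
    const editor = document.getElementById("g-container");
    if (editor) {
        const tracker = createInputTracker();
        const mark = () => tracker.markUserInput(performance.now());
        editor.addEventListener("keydown", mark);
        editor.addEventListener("paste", mark);

        new MutationObserver(() => {
            const manual = isLikelyManualEdit({
                editor,
                activeElement: document.activeElement,
                lastInputAt: tracker.getLastInputAt(),
                now: performance.now(),
            });
            console.log(manual ? "manual edit" : "suspicious edit");
        }).observe(editor, { childList: true, subtree: true, characterData: true });
    }
}
```

A childList change that arrives without any recent keyboard or paste event (signal 5) is classified as suspicious automatically. Legitimate edits that do not go through these two events, such as drag and drop or a context-menu cut, would also be flagged, so a production version would need more signals.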

With the above ideas, let’s try it out. In order to make the DEMO as easy to understand as possible, we slightly simplify the requirements and implement:

  1. An input box, the user can change the content by normal input

  2. If the content of the input box is modified through the console, the most recent manual modification is restored when the element is focused again.

  3. If no manual modification can be found in (2), the data is restored to its initial state.

Based on this, I give the rough pseudo code below:

<div id="g-container" contenteditable>This is a piece of content in a Web cloud document. If you edit it directly, you can edit it successfully. If modified using the console, the data will be restored. </div>
const targetElement = document.getElementById("g-container");
//Record initial data
let cacheInitData = '';
//Data reset flag bit
let data_fixed_flag = false;
//Reset cache object
let cacheObservingObject = null;
let cacheContainer = null;
let cacheData = '';

function eventBind() {
    targetElement.addEventListener('focus', (e) => {
        if (data_fixed_flag) {
            cacheContainer.innerText = cacheData;
            cacheObservingObject.disconnect();
            observeElementChanges(targetElement);
            
            data_fixed_flag = false;
        }
    });
}

function observeElementChanges(element) {
    const changes = []; // Array to store changes
    const targetElementCache = element.innerText;

    //Cache the initial data each time
    cacheInitData = targetElementCache;
    
    //Create MutationObserver instance
    const observer = new MutationObserver((mutationsList, observer) => {
        mutationsList.forEach((mutation) => {
            // console.log('observer', observer);
            const { type, target, addedNodes, removedNodes } = mutation;
            let realtimeText = "";
            
            if (type === "characterData") {
                realtimeText = target.data;
            }
            
            const change = {
                type,
                target,
                addedNodes: [...addedNodes],
                removedNodes: [...removedNodes],
                realtimeText,
                activeElement: document.activeElement
            };
            changes.push(change);
        });
        
        let isFixed = false;
        let container = null;
        
        for (let i = changes.length - 1; i >= 0; i--) {
            const item = changes[i];
            // console.log('i', i);
            if (item.activeElement === element) {
                if (isFixed) {
                    cacheData = item.realtimeText;
                }
                break;
            } else {
                if (!isFixed) {
                    isFixed = true;
                    container = item.target.nodeType === 3 ? item.target.parentElement : item.target;
                    cacheContainer = container;
                    data_fixed_flag = true;
                }
            }
        }
        
        if (data_fixed_flag && cacheData === '') {
            cacheData = cacheInitData;
        }
        
        cacheObservingObject = observer;
    });

    //Configure MutationObserver
    const config = { childList: true, subtree: true, characterData: true };

    // Start observing changes in elements
    observer.observe(element, config);
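    // Return a function that stops observing and returns the changes array
    return () => {
        observer.disconnect();
        return changes;
    };
}

// Bind the focus listener once; binding it inside observeElementChanges would
// add a duplicate listener every time observation restarts
eventBind();
observeElementChanges(targetElement);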

A brief explanation of the general process:

  1. observeElementChanges has appeared above. Its core is to record every change to the DOM element in the changes array.

    An extra activeElement field is recorded: the element that held the page focus at the moment the DOM element changed.

  2. Each time changes is updated, traverse the changes array in reverse order.

    If the focused element recorded for a change is not the observed DOM element, the change is considered an illegal modification, and the two flags isFixed and data_fixed_flag are set. The traversal then continues, looking for the latest normal modification record.

    isFixed marks that an illegal modification has been seen, so that the text of the latest normal modification found further back is saved as the rollback target.

  3. The data_fixed_flag flag determines whether the data needs to be rolled back when the element is focused again (when the focus event is triggered).
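The reverse traversal in step 2 can also be isolated as a pure function, which makes it easier to reason about. The following is a sketch, assuming each change record carries the activeElement and realtimeText fields used above; findRollbackText is a hypothetical name:

```javascript
// Walk the change log from newest to oldest. Returns null when the newest
// change is trusted (nothing to roll back), otherwise the text to restore.
function findRollbackText(changes, element, initialText) {
    let sawUntrusted = false;

    for (let i = changes.length - 1; i >= 0; i--) {
        const change = changes[i];
        if (change.activeElement === element) {
            // Newest trusted edit: restore it only if something untrusted
            // came after it; fall back to the initial text if it is empty
            return sawUntrusted ? (change.realtimeText || initialText) : null;
        }
        sawUntrusted = true;
    }
    // No trusted edit at all: fall back to the initial content
    return sawUntrusted ? initialText : null;
}

const editor = { id: "g-container" }; // stand-in for the editable element
const body = { id: "body" };          // stand-in for document.body
const log = [
    { activeElement: editor, realtimeText: "typed by hand" },
    { activeElement: body, realtimeText: "changed from the console" },
];
console.log(findRollbackText(log, editor, "initial")); // "typed by hand"
```

Keeping the decision free of DOM access means it can be exercised with plain objects, as in the usage lines above.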

OK, at this point, let’s take a look at the overall effect:

In this way, we successfully identified abnormal operations and restored to the last normal data.

Of course, the actual scene is definitely more complicated than this, and more details need to be considered. For the sake of overall understandability, the expression of the entire DEMO is simplified here.

For the complete DEMO effect, you can click here to experience it: [CodePen Demo — Editable Text Fixed]

Some thoughts

As for what this feature is good for, opinions will differ. For me, as someone developing an extension, it is a very thorny problem. From Yuque’s perspective, it is probably more of a security consideration.

Of course, we should not be limited to this scenario. Think about it, this solution can actually be applied to many other scenarios, for example:

  1. Front-end page watermarks: restore the watermark immediately when the style, structure, or content of the watermark DOM is tampered with.
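A watermark guard along these lines might look like the sketch below. guardWatermark and createWatermark are hypothetical names, and the observer constructor is passed as a parameter only so that the logic can be exercised outside a browser; the sketch assumes the watermark is a direct child of the container.

```javascript
// Hypothetical sketch: put the watermark back whenever it is removed or altered.
// createWatermark must return a fresh watermark node each time it is called.
function guardWatermark(container, createWatermark, ObserverCtor = globalThis.MutationObserver) {
    const options = { childList: true, subtree: true, attributes: true, characterData: true };
    let watermark = createWatermark();
    container.appendChild(watermark);

    const observer = new ObserverCtor((records) => {
        const tampered = records.some((record) =>
            watermark.contains(record.target) ||                // style, attributes, or content changed
            Array.from(record.removedNodes).includes(watermark) // the node itself was deleted
        );
        if (!tampered) return;

        // Swap in a fresh copy; stop observing so our own repair is not reported
        observer.disconnect();
        if (watermark.parentNode === container) container.removeChild(watermark);
        watermark = createWatermark();
        container.appendChild(watermark);
        observer.observe(container, options);
    });

    observer.observe(container, options);
    return () => observer.disconnect(); // call to stop guarding
}
```

In a page this would be called as guardWatermark(document.body, () => buildWatermarkNode()), where buildWatermarkNode stands for whatever function creates the watermark element. Replacing the node with a fresh copy, rather than trying to undo individual attribute changes, keeps the repair logic simple.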

Of course, there are ways around it. As an extension, I can inject my content script into the page earlier and hijack the global MutationObserver object before the page is loaded and rendered.
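A rough sketch of such a hijack follows. hijackMutationObserver and shouldHide are hypothetical names. The sketch assumes the code runs in the page’s own JavaScript context before any page script executes, which for a Chrome extension would presumably mean a content script declared with "run_at": "document_start" and "world": "MAIN", since scripts in the default isolated world do not share the page’s globals.

```javascript
// Hypothetical sketch: wrap the page's MutationObserver so that records we
// choose to hide are filtered out before the page's own callback sees them.
function hijackMutationObserver(globalObj, shouldHide) {
    const Native = globalObj.MutationObserver;

    globalObj.MutationObserver = class extends Native {
        constructor(callback) {
            super((records, observer) => {
                const visible = records.filter((record) => !shouldHide(record));
                // Stay silent if every record in this batch was hidden
                if (visible.length > 0) callback(visible, observer);
            });
        }
    };

    return () => { globalObj.MutationObserver = Native; }; // undo the hijack
}
```

In a page this might be called as hijackMutationObserver(window, (record) => isOwnEdit(record)), where isOwnEdit stands for however the extension recognizes its own changes. The sketch does not cover takeRecords(), which would still return unfiltered records, and a page that captures a reference to the native constructor some other way would be unaffected.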

All in all, you can try more interesting front-end interaction restrictions through the ideas provided in this article.

Article reprinted from: ChokCoco

Original link: https://www.cnblogs.com/coco1s/p/17816734.html