fix: prosemirror adds two extra spaces on paste

Bug

The project uses prosemirror. When a NodeSelection is copied and pasted, two extra spaces appear at the end.

NodeSelection

prosemirror's Selection is an abstract class with three subclasses:

  • TextSelection: the most common kind
  • NodeSelection: a selection that points at a single node. Clicking a node that has selectable = true produces a NodeSelection.
  • AllSelection: selects the entire document

A node with selectable = true gives a NodeSelection when clicked and a TextSelection when selected by dragging. Let's use prosemirror's dino example to illustrate:

Click on the red dinosaur

Inspect the DOM in devtools: the node has the selectednode class (ProseMirror-selectednode).

At this point the selection is a NodeSelection.

Drag to select the red dinosaur

Inspect the DOM again: the node does not have the selectednode class.

At this point the selection is a TextSelection.

Debug

Locating the event handlers

In devtools, open Elements, select the prosemirror container, open Event Listeners, and look at copy and paste. That leads to the input.ts file of prosemirror-view:

handlers.copy = editHandlers.cut = (view, _event) => {
  let event = _event as ClipboardEvent
  let sel = view.state.selection, cut = event.type == "cut"
  if (sel.empty) return

  // IE and Edge's clipboard interface is completely broken
  let data = brokenClipboardAPI ? null : event.clipboardData
  let slice = sel.content(), {dom, text} = serializeForClipboard(view, slice)
  if (data) {
    event.preventDefault()
    data.clearData()
    data.setData("text/html", dom.innerHTML)
    data.setData("text/plain", text)
  } else {
    captureCopy(view, dom)
  }
  if (cut) view.dispatch(view.state.tr.deleteSelection().scrollIntoView().setMeta("uiEvent", "cut"))
}

editHandlers.paste = (view, _event) => {
  let event = _event as ClipboardEvent
  // Handling paste from JavaScript during composition is very poorly
  // handled by browsers, so as a dodgy but preferable kludge, we just
  // let the browser do its native thing there, except on Android,
  // where the editor is almost always composing.
  if (view.composing && !browser.android) return
  let data = brokenClipboardAPI ? null : event.clipboardData
  let plain = view.input.shiftKey && view.input.lastKeyCode != 45
  if (data && doPaste(view, data.getData("text/plain"), data.getData("text/html"), plain, event))
    event.preventDefault()
  else
    capturePaste(view, event)
}

Copy

Simplified, the code is:

handlers.copy = editHandlers.cut = (view, e) => {
  // Nothing to copy for an empty selection
  if (view.state.selection.empty) return

  const {dom, text} = serializeForClipboard(view, view.state.selection.content())
  e.preventDefault()
  e.clipboardData.clearData()
  e.clipboardData.setData("text/html", dom.innerHTML)
  e.clipboardData.setData("text/plain", text)

  // On cut, also delete the selection
  if (e.type == "cut") view.dispatch(view.state.tr.deleteSelection())
}

The core operation is to obtain dom and text from serializeForClipboard and then write them to clipboardData.

Click the dino node and copy it (this is a NodeSelection). At a breakpoint, the HTML passed to setData is:

"<img dino-type="stegosaurus" src="/img/dino/stegosaurus.png" title="stegosaurus" class="dinosaur" data-pm-slice="0 0 []">"

The text passed to setData is an empty string.

No problem up to this point.

Paste

Next, look at paste. Simplified, the code is:

editHandlers.paste = (view, event) => {
  // ctrl + v and shift + insert paste; ctrl + shift + v pastes as plain text
  let plain = view.input.shiftKey && view.input.lastKeyCode != 45 // Paste as plain text flag
  doPaste(view, event.clipboardData.getData("text/plain"), event.clipboardData.getData("text/html"), plain, event)
  event.preventDefault()
}

The core operation is to read the data from clipboardData and call doPaste.
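The plain flag is easy to misread, so here is a minimal sketch of the same condition as a standalone function. The name isPlainTextPaste and the constant INSERT_KEY_CODE are made up for illustration and are not part of prosemirror-view; the only facts taken from the handler are the Shift check and the comparison with key code 45 (Insert).

```typescript
// Hypothetical helper mirroring the `plain` flag computed in the paste handler.
const INSERT_KEY_CODE = 45; // keyCode of the Insert key

function isPlainTextPaste(shiftKey: boolean, lastKeyCode: number): boolean {
  // Shift is held and the last key was not Insert
  // (Shift + Insert is an ordinary paste, not a plain-text paste).
  return shiftKey && lastKeyCode !== INSERT_KEY_CODE;
}

console.log(isPlainTextPaste(false, 86)); // Ctrl + V (86 is the V key): false
console.log(isPlainTextPaste(true, 45));  // Shift + Insert: false
console.log(isPlainTextPaste(true, 86));  // Ctrl + Shift + V: true
```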

Paste, and inspect at a breakpoint:


The text data is an empty string, which is fine. The html data has changed, though: it is now wrapped in an extra html > body > StartFragment layer.

Stepping through doPaste -> parseFromClipboard shows that the return value of parser.parseSlice is wrong. Its return value is [dino node, TextNode], with an unexpected text node after the dino node.

Looking further into the implementation of parseSlice, it reads the child nodes through childNodes and gets [text, comment, img, comment, text] (with children it would get only the single img element). The first and last text nodes contain "\n\n", and the two comment nodes are StartFragment and EndFragment respectively.

The child nodes are then processed in a loop: the "\n\n" at the beginning is ignored, StartFragment is ignored, img is kept, EndFragment is ignored, and the "\n\n" at the end is kept.

Switch to TextSelection and copy

For comparison, do the same with a TextSelection. Copy, and at a breakpoint the HTML passed to setData is as follows (compared with the NodeSelection case, there is an extra p wrapper around it):

<p data-pm-slice="1 1 []"><img dino-type="stegosaurus" src="/img/dino/stegosaurus.png" title="stegosaurus" class="dinosaur"></p>

The text is still an empty string.

TextSelection, paste

The obtained html data is:

<html>
<body>
<!--StartFragment--><p data-pm-slice="1 1 []"><img dino-type="stegosaurus" src="/img/dino/stegosaurus.png" title="stegosaurus" class="dinosaur"></p><!--EndFragment-->
</body>
</html>

In doPaste -> parseFromClipboard -> parser.parseSlice, childNodes is [text, comment, p>img, comment, text], and the child nodes are again processed in a loop. The difference from NodeSelection is that the last item, "\n\n", is ignored. (From a brief look, this is probably because when the loop reaches the last item the nodes generated so far are [p>img]; the root node p of the previous item is a block-level element, so the current "\n\n" is ignored.)
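To see where the trailing text comes from, here is a small string-level sketch that extracts whatever follows <!--EndFragment--> in the pasted HTML shown above and strips the tags. The helper name trailingTextAfterFragment is made up for illustration. It also rests on an assumption I did not trace in the debugger: that the browser's HTML parser keeps the whitespace found after the closing body tag inside body, which is my reading of the HTML parsing rules.

```typescript
// Hypothetical helper: returns the text (tags removed) that follows the
// <!--EndFragment--> marker in clipboard HTML.
function trailingTextAfterFragment(html: string): string {
  const endMarker = "<!--EndFragment-->";
  const at = html.lastIndexOf(endMarker);
  if (at === -1) return ""; // no marker, so no wrapper to inspect
  // Everything after the marker, with tags such as </body> and </html> removed.
  return html.slice(at + endMarker.length).replace(/<[^>]*>/g, "");
}

// Shortened version of the pasted HTML from the TextSelection case.
const pasted =
  "<html>\n<body>\n<!--StartFragment--><p data-pm-slice=\"1 1 []\">" +
  "<img class=\"dinosaur\"></p><!--EndFragment-->\n</body>\n</html>";

console.log(JSON.stringify(trailingTextAfterFragment(pasted))); // prints "\n\n"
```

Under that assumption, the two line breaks around the closing body tag are the trailing text node, which is consistent with the two extra spaces seen after pasting a NodeSelection.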

Summary

  1. When copying, data is set on e.clipboardData. NodeSelection sets img, TextSelection sets p>img.
  2. When pasting, data is read from e.clipboardData. The data read for NodeSelection has an extra html > body > StartFragment wrapper, and so does the data for TextSelection.
  3. When pasting, parseSlice gets the child nodes through childNodes. NodeSelection yields [text, comment, img, comment, text], TextSelection yields [text, comment, p>img, comment, text]. The comments are dropped and the "\n\n" at the beginning is dropped, but the "\n\n" at the end is kept for NodeSelection and dropped for TextSelection.

At this point I could not debug the code any further, and I could not change it either. I did not understand why the extra html > body layer was needed, since it is what produces the two extra spaces at the end. I'm speechless.

Discussion

I went back to Google and found the thread "Space added on paste", which describes the same problem I ran into. Someone replied:

So this turned out to be Windows' doing, which I did not expect at all. I had always assumed the wrapper was added somewhere in prosemirror's internal processing.

Windows HTML Clipboard Format: The fragment should be preceded and followed by the HTML comments <!--StartFragment--> and <!--EndFragment--> to indicate where the fragment starts and ends.

The thread also gives the solution: strip the wrapper that Windows adds inside transformPastedHTML, and the problem goes away.
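A minimal sketch of that idea follows. This is my own illustration, not the code posted in the thread: the function name stripClipboardWrapper is made up, and it assumes the wrapper always uses the exact <!--StartFragment--> and <!--EndFragment--> markers described in the Windows format.

```typescript
// Hypothetical helper: keeps only what sits between the Windows clipboard
// fragment markers, dropping the html/body wrapper and the line breaks around it.
function stripClipboardWrapper(html: string): string {
  const startMarker = "<!--StartFragment-->";
  const endMarker = "<!--EndFragment-->";
  const from = html.indexOf(startMarker);
  const to = html.lastIndexOf(endMarker);
  // If either marker is missing or they are out of order, leave the HTML untouched.
  if (from === -1 || to === -1 || to < from + startMarker.length) return html;
  return html.slice(from + startMarker.length, to);
}

const sample =
  "<html>\n<body>\n<!--StartFragment--><img class=\"dinosaur\">" +
  "<!--EndFragment-->\n</body>\n</html>";

console.log(stripClipboardWrapper(sample)); // prints <img class="dinosaur">
```

A function like this could then be supplied as the transformPastedHTML prop when the EditorView is created, so that it runs on the pasted HTML before parsing. HTML without the markers passes through unchanged, so pastes from sources that do not add the wrapper should be unaffected.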