Use ChatGPT to control Vim and make new discoveries!

What happens when the AIGC tool ChatGPT meets the venerable text editor Vim? The author of this article, Lachlan Gray, tried it out and was pleasantly surprised by the automation that resulted. Let’s take a look.

Original text: https://lachlan-gray.com/Controlling+Vim+With+ChatGPT

Do not redistribute without permission.

Author | Lachlan Gray

Translation Tools | ChatGPT Editor | Su Mi

Produced | CSDN (ID: CSDNnews)

The following is the translation:

Vim is a text-based editor. As a user, every piece of information you get from Vim is text, and everything you send to it is text as well. We also interact with the popular ChatGPT mainly through text messages. Since both tools are driven entirely by text, can we cut out the middleman and have ChatGPT control Vim directly?

The code is available here: https://github.com/LachlanGray/vim-agent

A match made in heaven

Vim is unique in that if you know how to use the editor, you also know how to automate it. In short, every interaction you have with the editor can be converted into a VimScript program. Any sequence of operations, in any combination of typing, changing, deleting, editing commands, shell commands, opening files, and so on, can be carried out by a program. Whether this benefits humans is debatable, but combined with a language model it becomes extremely valuable.

VimScript has been around for a long time, and the internet is flooded with VimScripts for doing all kinds of tedious tasks. VimScript is basically a domain-specific programming language designed for manipulating text. It has received some criticism for its age and rigidity, but it does an excellent job at what it’s supposed to do. For us, so much existing VimScript means that ChatGPT should work really well with it and, in theory, be able to do everything I can do with an editor.

Neovim API

How will ChatGPT actually control the editor? Conveniently, Neovim has a Python client (https://pynvim.readthedocs.io/en/latest/) that lets you control the editor through a Python API.

First, you can start Neovim and set a listening address by:

nvim --listen 127.0.0.1:7777

This will start a regular Neovim instance, but it will listen for connections. Now, through the Python API, we can tell Neovim to perform actions. The Python API connects to Neovim using TCP (Transmission Control Protocol), a network protocol designed to ensure that data is received in its entirety and arrives in the order it was sent. We create a vim object to manage Neovim’s state:

from pynvim import attach

vim = attach('tcp', address='127.0.0.1', port=7777)

This vim object gives us full access to Neovim, including programmatic access to the editor’s state and functionality. Among many convenience wrappers, it provides vim.request(), a catch-all tool that can be used with everything in the API documentation (https://neovim.io/doc/user/api.html). It allows us to do almost anything, such as running a command:

vim.request('nvim_command', "q!")
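As a rough illustration of how `vim.request()` maps onto the API documentation, the sketch below gathers a small snapshot of the editor state. The helper name `editor_snapshot` is hypothetical and not taken from the original project. The three API functions it calls (`nvim_buf_get_lines`, `nvim_win_get_cursor`, `nvim_get_mode`) are documented Neovim API functions, and passing `0` as the buffer or window handle means "the current one".

```python
def editor_snapshot(vim):
    """Collect basic editor state through the generic request() call.

    `vim` is assumed to be the object returned by pynvim's attach();
    any object with a compatible request() method works as well.
    """
    # Buffer handle 0 means "current buffer"; 0, -1 selects every line.
    lines = vim.request('nvim_buf_get_lines', 0, 0, -1, True)
    # Window handle 0 means "current window"; row is 1-based, column 0-based.
    row, col = vim.request('nvim_win_get_cursor', 0)
    # nvim_get_mode returns a dict with 'mode' and 'blocking' keys.
    mode = vim.request('nvim_get_mode')['mode']
    return {'lines': lines, 'cursor': (row, col), 'mode': mode}
```

Because the function only relies on `request()`, it can be tried out against a stand-in object before pointing it at a live editor.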

Here’s our plan: wrap all the features we care about and simplify them so that they are easy to hook up to the language model. We will call this wrapper VimInstance. It will hold the vim object and expose simple properties to access information, as well as methods to control the Vim instance:

from pynvim import attach

class VimInstance:
    def __init__(self):
        # connect to the Neovim instance started with --listen
        self.vim = attach('tcp', address='127.0.0.1', port=7777)

    @property
    def current_buffer_content(self):
        # list containing every line of the current buffer
        return self.vim.request(
            'nvim_buf_get_lines',
            self.vim.current.buffer,
            0, -1, True)

    # ...

    def input(self, keys):
        # send raw keystrokes, as if typed by the user
        self.vim.input(keys)

    # ...

Then we can do something like:

vim = VimInstance()

# list containing each line of current file
text = vim.current_buffer_content

# type "hello" at the start of the file and save
vim.input("ggIhello<Esc>:w<CR>")
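One detail worth knowing when sending text this way: the underlying `nvim_input` call interprets key notation such as `<Esc>` and `<CR>`, so a literal `<` in the text has to be written as `<lt>`. Below is a minimal sketch of helpers for that, assuming single-line text. `literal_keys` and `insert_line_keys` are hypothetical names, not part of pynvim or the original project.

```python
def literal_keys(text):
    """Escape text so nvim_input types it verbatim instead of parsing key codes."""
    return text.replace('<', '<lt>')


def insert_line_keys(text):
    """Build keys that open a new first line in the buffer and type `text` on it."""
    # gg: jump to the first line, O: open a line above and enter insert mode
    return 'ggO' + literal_keys(text) + '<Esc>'
```

With the wrapper above, `vim.input(insert_line_keys("if a < b:"))` would then type the comparison as written rather than treating `<` as the start of a key code.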

Give me the code!

One of the great certainties in life is that ChatGPT will tell you what it is doing. And usually, it will say something even if you don’t ask for it. We’re definitely going to run into this problem right away. Rather than coaxing the model, it might be easier to just wait for the code to appear, and then hang up the connection when the code completes.

The streaming capabilities of OpenAI’s chat completion endpoint are useful for this. Streaming lets us inspect chunks of text as they arrive and decide in real time what to do with them. We can iterate over the completion as it arrives, yielding one chunk of text at a time:

import openai

def chat_3(messages: list[dict]):
    # uses the ChatCompletion interface of the pre-1.0 openai package
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0.9,
        stream=True
    )
    for chunk in completion:
        # some chunks carry only metadata (e.g. the role) and no text
        if "content" in chunk.choices[0].delta:
            yield chunk.choices[0].delta["content"]

Now we can handle the model’s response without waiting for the model to complete:

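def filtered_chat(request: str):
    # a chat message is a single dict holding both "role" and "content"
    messages = [{"role": "user", "content": request}]
    for chunk in chat_3(messages):
        # <condition> is a placeholder for whatever filter we choose
        if <condition>:
            yield chunk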

To collect the code, we can monitor the response as it streams in and wait for the start of a code block (three backticks, an optional language tag, and a newline, roughly the pattern `{3}.*?\n) to appear. Once it matches, we know it’s time to start listening. We then keep listening until we encounter another run of backticks. If all goes well, we are left with only the code.
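The fence-watching idea can be sketched as follows. This is a minimal illustration under assumptions, not the project's actual implementation: the name `extract_code` and the exact regular expression are made up for this example, and the function accepts any iterable of text chunks, such as a streaming generator.

```python
import re

FENCE = "`" * 3  # three backticks mark both ends of a code block
# Opening fence: three backticks, an optional language tag, then a newline.
FENCE_OPEN = re.compile(r"`{3}[^\n]*\n")


def extract_code(chunks):
    """Return the contents of the first fenced code block in a stream of chunks.

    Stops consuming `chunks` as soon as the closing fence is seen, so a
    streaming generator can be abandoned early.
    """
    buffer = ""
    start = None  # index in `buffer` where the code begins, once known
    for chunk in chunks:
        buffer += chunk
        if start is None:
            match = FENCE_OPEN.search(buffer)
            if match is None:
                continue  # still in the preamble, keep waiting
            start = match.end()
        end = buffer.find(FENCE, start)
        if end != -1:
            return buffer[start:end]
    # Stream ended without a closing fence: return whatever code arrived.
    return buffer[start:] if start is not None else ""
```

Accumulating the text in a buffer, rather than testing each chunk on its own, matters because a fence can be split across two chunks.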

GPT takes over control

Now we have everything we need to plug ChatGPT into our editor. For basic setup, let’s tell ChatGPT what to do. A basic prompting strategy is as follows:

1. Give an instruction and provide the screen content as context.
2. Explicitly ask for a VimScript that performs the task.
3. Execute each line of the script in Neovim.
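The three steps above can be sketched roughly as below. The prompt wording and the names `build_prompt`, `script_commands`, and `run_script` are assumptions made for illustration, not the author's actual prompt or code. `vim` is assumed to be a pynvim `Nvim` object (for example the `vim` attribute of `VimInstance`), whose `command()` method runs a single Ex command.

```python
def build_prompt(instruction, screen_lines):
    """Combine the user's instruction with the visible buffer text."""
    screen = "\n".join(screen_lines)
    return (
        "Here is the current content of the editor:\n"
        f"{screen}\n\n"
        f"Task: {instruction}\n"
        "Respond with a VimScript that performs the task, in a single code block."
    )


def script_commands(script):
    """Split a script into non-empty commands, dropping VimScript comment lines."""
    commands = []
    for line in script.splitlines():
        line = line.strip()
        # full-line VimScript comments start with a double quote
        if line and not line.startswith('"'):
            commands.append(line)
    return commands


def run_script(vim, script):
    """Execute each command of `script` via vim.command(); return the commands run."""
    commands = script_commands(script)
    for command in commands:
        vim.command(command)
    return commands
```

Running a script one line at a time keeps the loop simple, but multi-line constructs such as `if ... endif` may not survive it, which is one possible reason more elaborate requests are less reliable.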

Below are the results. On the left is Neovim opening a file, on the right I’m typing commands. It…kind of works:

It performed reasonably well for basic operations, like deleting and rearranging content on the screen, but was less reliable with more advanced requests, like converting comments to Pig Latin (a children’s word game). The good news is that there are countless ways to improve it, so the future of automatic text editing looks promising! The next step is to develop a better prompting strategy than this one-size-fits-all approach of asking for a script. Once we have a good set of basic operations, I want to build an agent similar to “Voyager” (https://github.com/MineDojo/Voyager) and see how far it goes.
