AI-Assisted Coding in Practice: Quickly Building an Nginx Configuration Formatter

This article shows how to use GPT to quickly complete a small open source project, solve a practical problem, and earn a contributor badge in the Nginx open source community on GitHub.

“Talk is Cheap, Show you the Code.”

Preface

This is a write-up that should have been published last month.

Some time ago, an investor friend asked me several times how to use GPT or related tools to write code, hoping for a “step by step” tutorial. As it happens, I had just such an example a few days earlier, so I wrote this article.

In fact, I have written a lot of hands-on content about Nginx before, and I can hardly claim not to like such highly practical open source software. While tinkering with internal services last month, I once again used my old companion Nginx, along with NJS, which I have written about many times.

To verify a feature faster (read: too lazy to write code), I opened the official nginx/njs-examples repository on GitHub to look for sample configurations.

When I opened the official project’s configuration in my code editor, what greeted me looked as if it had been preserved from the 1990s to the present:

  • The most obvious problem is mixed indentation: tabs and spaces are used side by side, and the amount of indentation varies freely;
  • Secondly, across different files, even return directives with the same meaning are written in many different ways;
  • What I find hardest to tolerate is that the formatting plug-in in my editor breaks otherwise correct configuration syntax. When I looked at the repositories of the plug-in and of its core dependency (nginxbeautifier), I found that the former had been archived (abandoned) because of this problem, while the latter has been slow to fix such issues for lack of community contributors, although a project that has kept going for 7 years is admirable.

This is not only a matter of “obsessive-compulsive” code cleanliness; I also want Nginx configuration files to be concise, tidy, and reliable. If there is no reliable and easy-to-use Nginx formatting tool, then let’s build one.

After all, Talk is Cheap.

The open source Nginx configuration formatter: Nginx Formatter

I have uploaded the complete project to soulteary/nginx-formatter, and I hope it helps you.

Of course, a star, fork, or watch on the repository is also very welcome.

Solution design

Before we start, it is best to draw up a simple plan and do some feasibility research on it.

Survey of existing community projects

I briefly went through the community projects related to Nginx configuration formatting, including the code and history of nginxbeautifier, an open source formatter that has kept going for 7 years.

I found that GitHub hosts few tools for formatting Nginx configuration, yet they fall into three language camps and two approaches. By language:

  • Python implementation (nginxfmt.py project)
  • JavaScript implementation (borrowed from nginxfmt.py)
  • Golang implementation (borrowed from nginxfmt.py)

By processing method, there are two approaches:

  • Formatting based on string features
  • Formatting based on an abstract syntax tree (AST)

The first approach treats the symptoms: it solves the problem quickly, but as Nginx configuration grows more complex, the parsing and formatting logic may fail to keep up and the checks may not be thorough enough, which leads to formatting errors.
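
To make the idea of “formatting based on string features” concrete, here is a minimal sketch in Go. It is not the logic of any of the projects mentioned; the function name indentByBraces is made up for illustration. It re-indents a configuration purely by looking at whether a trimmed line starts with "}" or ends with "{".

```go
package main

import (
	"fmt"
	"strings"
)

// indentByBraces re-indents an Nginx-style config using only string
// features: it trims each line and tracks nesting depth from lines that
// start with "}" or end with "{". It does not understand quotes or
// comments, which is exactly the weakness of this approach.
func indentByBraces(config string, indent string) string {
	var out []string
	depth := 0
	for _, raw := range strings.Split(config, "\n") {
		line := strings.TrimSpace(raw)
		if strings.HasPrefix(line, "}") && depth > 0 {
			depth-- // a closing brace belongs to the outer level
		}
		if line == "" {
			out = append(out, "") // keep blank lines, without trailing spaces
		} else {
			out = append(out, strings.Repeat(indent, depth)+line)
		}
		if strings.HasSuffix(line, "{") {
			depth++ // everything after an opening brace is nested
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	messy := "server {\n\t  listen 80;\n location / {\nreturn 200;\n  }\n}"
	fmt.Println(indentByBraces(messy, "  "))
}
```

A comment line such as "# TODO {" would be counted as an opening block by this sketch, which illustrates how a string-based formatter can end up “damaging” a configuration.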

For example, raynigon/vscode-nginx-formatter, a plug-in with 200,000 downloads in the VSCode marketplace, takes this approach (it is based on the JavaScript version of nginxbeautifier), and some users do report that it “damages” their configuration.

The second approach solves the problem more reliably, but it requires a complete understanding of the Nginx configuration file grammar and takes extra time to implement.
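
As a rough illustration of why a syntax-aware approach is more reliable, the first step of building a syntax tree is usually a tokenizer that keeps quoted strings and comments intact. The following Go sketch is a simplified assumption of what such a step could look like, not the real Nginx grammar: the tokenize function is hypothetical, and variables written as ${name} are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenize splits Nginx-style config text into tokens: words, quoted
// strings (kept whole), comments (kept whole), and the symbols { } ;
// Simplification: a quote or '#' is only special at the start of a token.
func tokenize(src string) []string {
	var tokens []string
	var cur strings.Builder
	flush := func() {
		if cur.Len() > 0 {
			tokens = append(tokens, cur.String())
			cur.Reset()
		}
	}
	runes := []rune(src)
	for i := 0; i < len(runes); i++ {
		c := runes[i]
		switch {
		case (c == '"' || c == '\'') && cur.Len() == 0:
			// Quoted string: copy everything up to the matching quote.
			j := i + 1
			for j < len(runes) && runes[j] != c {
				if runes[j] == '\\' && j+1 < len(runes) {
					j++ // skip the escaped character
				}
				j++
			}
			end := j + 1
			if end > len(runes) {
				end = len(runes) // unterminated string: take the rest
			}
			tokens = append(tokens, string(runes[i:end]))
			i = end - 1
		case c == '#' && cur.Len() == 0:
			// Comment: runs to the end of the line.
			j := i
			for j < len(runes) && runes[j] != '\n' {
				j++
			}
			tokens = append(tokens, string(runes[i:j]))
			i = j - 1
		case c == '{' || c == '}' || c == ';':
			flush()
			tokens = append(tokens, string(c))
		case c == ' ' || c == '\t' || c == '\n' || c == '\r':
			flush()
		default:
			cur.WriteRune(c)
		}
	}
	flush()
	return tokens
}

func main() {
	line := `return 200 "a { b; }"; # done {`
	fmt.Println(strings.Join(tokenize(line), " | "))
}
```

Because the braces inside the quoted string and the comment never become separate tokens, a formatter built on top of such tokens would not mistake them for block boundaries.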

Moreover, I am not convinced that syntax-analysis projects created some time ago fully support today’s Nginx configuration, which is far richer than it was in earlier years.

So here we first implement a solution that solves the problem, even if it is not perfect.

Cross-validating the plan with AutoGPT

Of course, before implementing it, we can use tools such as AutoGPT to break down and analyze the task or idea, and let them help us spot gaps in the plan.

Use AutoGPT for cross-validation of the scheme

There are many similar tools; just pick one from the community and run it with Docker. Because model output is somewhat random, we can try several times and adjust the prompt to make the answer more comprehensive. Since the prompts built into many projects are in English by default, the results also come back in English.

Here we can be lazy and let ChatGPT help: paste the content into ChatGPT and add one sentence above it: “Translate the following content into Chinese”.

Use ChatGPT for content translation

Then we wait a moment, and the content becomes easier to read in our native language.

Content translated using ChatGPT

Final solution design

Weighing the points above against the implementation time, we choose “formatting based on string features” to solve the problem.

I want the tool to work out of the box without any dependency issues, so my base stack is Golang.

However, the Golang ecosystem has no formatting library comparable to those in the Python or JavaScript ecosystems, so we either implement one by hand or let the community’s Python or JavaScript code run inside our Golang program as part of it.

Compared with the former, the latter needs less code and is faster to implement, so let’s go that way.

Hands-on: asking GPT how to implement the basic functions

The previous section covered the implementations that already exist in the open source community and the approach we plan to use. During actual coding, we can use ChatGPT to fill in the logic.

To demonstrate the lowest-cost route, and considering that most people still face usage limits on GPT-4, we use GPT-3.5 here even though GPT-4 would also work.

Adjusting the JavaScript version of the formatter

Although users have complained about the JavaScript version of the formatter, the program is still usable once its corner cases are fixed. The complete code is in soulteary/nginx-formatter/internal/formatter/beautifier.js, a little over two hundred lines. The overall structure is as follows:

/**
 * - Soulteary Modify the JavaScript version for golang execution, under [Apache-2.0 license], 18/04/2023:
 * - simplify the program, fix bugs, improve running speed, and allow running in golang
 * - https://github.com/soulteary/nginx-formatter
 *
 *History:
 * - Yosef Ported the JavaScript beautifier under [Apache-2.0 license], 24/08/2016
 * - https://github.com/vasilevich/nginxbeautifier
 * - Slomkowski Created a beautifier for nginx config files with Python under [Apache-2.0 license], 24/06/2016
 * - https://github.com/1connect/nginx-config-formatter (https://github.com/slomkowski/nginx-config-formatter)
 */

/**
 * Grabs text in between two seperators seperator1 thetextIwant seperator2
 * @param {string} input String to seperate
 * @param {string} seperator1 The first seperator to use
 * @param {string} seperator2 The second seperator to use
 * @return {string}
 */
function extractTextBySeperator(input, seperator1, seperator2) {
...
}

/**
 * Grabs text in between two seperators seperator1 thetextIwant seperator2
 * @param {string} input String to seperate
 * @param {string} seperator1 The first seperator to use
 * @param {string} seperator2 The second seperator to use
 * @return {object}
 */
function extractAllPossibleText(input, seperator1, seperator2) {
...
}

/**
 * @param {string} single_line the whole nginx config
 * @return {string} stripped out string without multi spaces
 */
function strip_line(single_line) {
...
}


/**
 * @param {string} configContents the whole nginx config
 */
function clean_lines(configContents) {
...
}


function join_opening_bracket(lines) {
...
}

function fold_empty_brackets(lines) {
...
}


function add_empty_line_after_nginx_directives(lines) {
...
}


function fixDollarVar(lines) {
...
}


var options = { INDENTATION: "\t" };

function perform_indentation(lines) {
...
}


function FormatNginxConf(text, indentSize = 2, indentChar = " ") {
...
}

During implementation, if you would rather not do the work yourself, you can hand it to ChatGPT, for example by pasting in the old code and asking what it does:

Use ChatGPT to interpret the code

Especially for old, obsolete code (particularly code written by others), we can ask ChatGPT to explain it and to write some unit tests for it. This greatly shortens the time spent reading code.

Use ChatGPT to interpret the code

Of course, the content it generates is often flawed, and we need to review it carefully or run additional tests to verify it. Even so, this is faster than doing everything ourselves from scratch.

Running JavaScript in Golang

As mentioned above, Golang has no library similar to nginxfmt.py or nginxbeautifier, so short of switching the technology stack, the fastest way to meet our needs is to run these programs written in other languages directly inside Golang.

Here we ask ChatGPT: “How to run JavaScript code in Golang”.

Ask ChatGPT how to run JavaScript code in Golang

As we can see, ChatGPT’s answer recommends goja and gives the simplest possible implementation. It is indeed an interesting project: an ECMAScript 5.1 engine implemented in pure Go, which lets us run JavaScript code directly in Golang.

Besides goja, as in my earlier open source project soulteary/rss-can, we can also use the more powerful v8go for this; it executes faster, but the built binary is somewhat larger.

Shipping the program as a single file

As mentioned earlier, we want a single file that can be carried anywhere, without a pile of dependencies and configuration files.

We all know that Golang compiles to a single binary, but by default the build only handles Go files. So how do we make the JavaScript part of the Golang program? Long-time readers will think of the go embed approach I have written about.

If you are not familiar with it, I recommend reading about Golang resource embedding solutions, which covers the background and how the different approaches perform.

Here, however, we want a specific feature as quickly as possible, so we might as well ask ChatGPT directly: “How to use Embed in Golang to embed a JS file.”

How to use Go Embed to combine JS and Golang into one

Using follow-up questions to get the code we want

In the section above, for example, we asked how to run JavaScript code in Golang.

For our actual needs, we should build a Go formatting function that accepts the necessary parameters: the original configuration content, the indentation size, and the indentation character.

We can then ask follow-up questions in the same conversation:

How to use Go Embed to combine JS and Golang into one

Overall, ChatGPT’s performance is acceptable:

ChatGPT's answer to the question

While writing the tool, we will ask for many specific implementations like the one above.

But not every piece of generated code is useful, and not every piece is correct. In that case, we can continue the conversation based on the generated code to bring ChatGPT’s answer closer to what we want. If an answer is wrong, we can have it regenerate. If several regenerations are still unsatisfactory, the question is most likely not precise enough and needs to be adjusted.

Optimizing the generated code

The code above meets the requirements but is too verbose. Code generated by default tends to be straightforward, and because our questions were fairly simple, the result is a bit long-winded.

So we need to optimize what GPT generates, for example the key Formatter function mentioned above (located at soulteary/nginx-formatter/internal/formatter/formatter.go):

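package formatter

import (
	"fmt"

	"github.com/dop251/goja"
)

// Formatter formats the Nginx configuration in s by calling the JavaScript
// FormatNginxConf function inside a goja runtime.
// indent is the indentation size and char is the indentation character.
func Formatter(s string, indent int, char string) (string, error) {
	if s == "" {
		return "", nil
	}
	vm := goja.New()
	// JS_FORMATTER is defined elsewhere in the package; presumably it holds
	// the JavaScript formatter source.
	// Note: s and char are spliced into JavaScript template literals, so
	// backticks, "${" sequences, and backslashes in them must be escaped first.
	v, err := vm.RunString(fmt.Sprintf("%s;FormatNginxConf(`%s`, %d, `%s`)", JS_FORMATTER, s, indent, char))
	if err != nil {
		return "", err
	}
	return v.String(), nil
}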

Of course, along the way you can also ask ChatGPT about specific optimizations and function usage.

Hands-on: polishing the program

Once the core functions are implemented, what remains is general peripheral functionality and tests for quality assurance.

Writing general-purpose functions is relatively simple; prompts like the following get the job done:

  • Use Golang to complete the xxx function
  • Use the xx framework/toolkit in the Golang language to complete the xxx function

I won’t expand on these here, as it would waste space. Let’s talk about the typical case of unit tests instead.

Unit testing with GPT

Few engineers enjoy writing tests, especially when the program changes frequently: the more tests we write, the more likely they are to become dead code as the project evolves.

But what if this could be done without lifting a finger?

Ask ChatGPT directly for unit test code

For example, we paste the code above directly into ChatGPT and ask it to write the unit tests.

ChatGPT completed unit test

If the context is unclear and the function lacks doc comments, ChatGPT generally produces fairly generic code. If you are willing to improve the comments or provide more context, you can get test code with 80-90% coverage, or even complete coverage.
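
Generated Go tests often follow a table-driven layout. The sketch below shows that layout in a self-contained program; the collapseSpaces helper under test is hypothetical, and in a real project the table would sit in a _test.go file and report failures through testing.T instead of returning them.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseSpaces is a hypothetical helper under test: it trims a line and
// collapses runs of whitespace into single spaces.
func collapseSpaces(line string) string {
	return strings.Join(strings.Fields(line), " ")
}

// testCase is one row of the table: a name, an input, and the expected output.
type testCase struct {
	name  string
	input string
	want  string
}

// runCases checks every row and returns the names of the rows that failed.
func runCases(cases []testCase) []string {
	var failed []string
	for _, tc := range cases {
		if got := collapseSpaces(tc.input); got != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	cases := []testCase{
		{"empty input", "", ""},
		{"tabs and spaces", "\tlisten   80;  ", "listen 80;"},
		{"already clean", "return 200;", "return 200;"},
	}
	fmt.Println("failed cases:", runCases(cases))
}
```

Adding a new scenario then means adding one row to the table, which keeps the cost of maintaining tests low when the program changes often.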

Hands-on: improving coding efficiency

Apart from chatting with ChatGPT and mechanically copying and pasting code, is there an even lazier way to write code?

The answer is yes, with the help of offline and online language models.

Using a local code completion model

Code completion with offline models is an old topic. If you care about real-time responsiveness and tight integration with your coding tools, my personal recommendation for now is still TabNine.

As for other tools, I suggest trying them yourself; the differences in performance, generated results, and IDE compatibility are quite noticeable.

I have been using it for about three years, and the local models total 1.2G. In a week with a lot of coding, it saves me roughly 13-30% of my typing time.

# pwd
/Users/soulteary/Library/Application Support/TabNine/models

# ls
29b87067.tabninemodel b8373e4b.tabninemodel ce94127b.tabninemodel
# du -hs *
241M 29b87067.tabninemodel
685M b8373e4b.tabninemodel
256M ce94127b.tabninemodel

# du -hs .
1.2G .

However, how good TabNine gets depends on how much code it has seen and on the quality of that code. Cultivating a good model is like raising a digital pet: it takes time.

Using an online code completion model

If you want something that works out of the box and your code is not particularly sensitive, online code completion will suit you better.

Copilot, an easy-to-use code completion tool

The only recommendation here is GitHub Copilot. With a good network connection, completions generally arrive within 1-2 seconds.

By default you may have to pay a small subscription fee for this feature. Fortunately, my account is eligible to use it for free.

Free Unlimited Copilot

There are plenty of introductions to Copilot online, so here I will only share two tips from actual use.

Generating based on context (open files)

While programming we open many different files. If the code we want to generate relates to only one or a few of them, consider closing the others.

Generating based on context (including the clipboard)

If we want to generate code for a specific piece of content, we can first copy and paste that content in as context before generating. This saves some effort during generation.

Hands-on: finishing touches

After the coding is done, some finishing work remains.

For example, writing the project documentation in both Chinese and English, and designing the project’s logo.

Writing open source project documentation with GPT

As with AutoGPT earlier in this article, we can submit content several times and let ChatGPT draft the skeleton of the project documentation.

Let ChatGPT generate project document template

Then we replace the content in the document according to the actual situation.

For the English document, do the same as when translating the AutoGPT output into Chinese above, but in reverse: let ChatGPT translate the content into English.

Simple and easy, isn’t it?

Using Midjourney to create the project icon

One of the hardest parts of creating a project is designing its logo. But with Stable Diffusion (SD) and Midjourney, it has become very easy.

We just give it an instruction: “Help me design a Logo, the content of the Logo is…”

Use Midjourney to generate project logo

Of course, in practice, rewriting the prompt in English gives better results.

Translate the prompt content into English in advance

If you often use image models such as Midjourney, you can try another open source project of mine that spent a long time on the GitHub trending list: “Eighty Lines of Code to Realize Open Source Midjourney, Stable Diffusion “Spell” Drawing Tool”.

Other

That covers the main content of the article.

Let’s talk about something interesting in the open source community.

Interesting things in the open source community

Last year, someone opened a pull request in the Nginx community containing several changes that removed redundant spaces from the configuration.

PR in the community

When I saw this commit, I considered it a low-value PR, because it provided neither a consistent standard nor a reproducible tool. So I left a comment: “This change seems unnecessary, perhaps providing a general formatting tool is more valuable for developers.”

However, neither the submitter nor the project maintainers replied, and the PR has been left hanging for a year. So let’s take this opportunity and use ChatGPT to solve the problem.

Completely solve the submission of Nginx community project

So far, I have used this small tool to tidy up the official Nginx configuration repository, and in doing so earned a contributor badge in the Nginx open source community.

Illuminated Nginx project contributor icon

Finally

I have finally tidied up this article, which sat in my drafts for a month. I hope that as the business settles into a normal rhythm, I will have more time to share how to “tinker now so as not to tinker later”.

–EOF

This article is licensed under “Attribution 4.0 International (CC BY 4.0)”. You are welcome to reprint or reuse it, provided that you credit the source.

Author of this article: Su Yang

Created: May 20, 2023
Word count: 10491 words
Reading time: 21 minutes
Link to this article: https://soulteary.com/2023/05/20/code-writing-practice-supported-by-ai-quickly-implement-nginx-configuration-formatting-tool.html