ChatGPT dancing at your fingertips: open-interpreter enables one-stop local data collection and processing


The Python teaching column aims to give beginners a systematic and comprehensive introduction to Python programming. Through step-by-step explanations of Python's basic syntax and programming logic, combined with practical cases, newcomers can get started with Python easily!

>> Click here to view previous Python teaching content

Table of contents

1. Foreword


2. Introduction to Open-Interpreter


3. Installation and operation


4. Work Scenarios


(1) Obtaining web content

(2) Batch conversion of PDF files

(3) Excel file merging


5. Summary

This article has a total of 4192 words and takes about 11 minutes to read. Corrections are welcome!

Part1 Preface

This issue introduces open-interpreter, an open-source project published by Killian Lucas on GitHub, which allows AI large language models (LLMs) to run code (Python, JavaScript, Shell, etc.) on a local computer. It serves the same purpose as the method of running code by calling ChatGPT introduced in an earlier article. (Portal: Python practice | ChatGPT + Python realizes fully automatic data processing/visualization)

Of course, by comparison, open-interpreter is more powerful and complete and can handle a variety of tasks more flexibly. It is currently on the GitHub trending list and has received 17k+ stars. This article introduces how to use open-interpreter and gives some application examples.

Part2 Introduction to Open-Interpreter

In a nutshell, Open-Interpreter is an AI tool deployed on your local computer that helps you carry out operations on the machine itself, calling on the local network connection and programming environment to collect, manipulate, and process local data.

In fact, OpenAI has also released a code interpreter that uses the GPT-4 model and runs in a sandboxed, firewalled execution environment. OpenAI's code interpreter supports uploading and downloading files, but with a file size limit of 100 MB. In addition, for security reasons, OpenAI imposes strict restrictions on this interpreter: it cannot access the network and can only use a limited set of third-party libraries[1].

Code interpreter released by OpenAI

Compared with the interpreter released by OpenAI, open-interpreter has the following advantages:

  • Supports networking and can access the Internet through Python third-party libraries
  • Runs locally, with no limits on file size or running time
  • Any library can be used; GPT will include the commands to install missing libraries in the code it gives
  • Supports GPT-4 and GPT-3.5-Turbo; even without an API key, you can switch the model to the open-source Code Llama

Part3 Installation and Operation

open-interpreter can run both in a Python development environment and in the local terminal (you need to make sure the relevant programming language is already installed locally), but the publisher Killian Lucas prefers running it in the terminal. In the tests for this article, the author also found the terminal more convenient. Whichever mode you use, the installation method is the same:

pip install open-interpreter

After the installation is complete, if you want to run it in the terminal, there are three ways to start it:

  • Default start – uses the GPT-4 model: interpreter
  • Fast start – uses the GPT-3.5-Turbo model: interpreter --fast
  • Local start – uses a local model (free): interpreter --local

After entering the startup command in the terminal and pressing Enter, you will be prompted to provide an OpenAI API Key. Enter the key and press Enter again to run open-interpreter:

open-interpreter terminal running interface (using GPT-3.5-Turbo model)

By the way, if you don't want to re-enter the OpenAI API Key every time you use it, you can store the key in an environment variable so that it is imported automatically on every run. Just search for environment variables in your computer's settings and create a new variable named "OPENAI_API_KEY".
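If you want to double-check from Python that the variable is actually visible (for example after restarting your terminal or IDE), a minimal check looks like this (just a small sketch; how open-interpreter itself reads the key may vary by version):

import os

# Check whether the OPENAI_API_KEY environment variable is visible to Python.
# Note: a variable set via system settings or `setx` only appears in newly
# opened terminal / IDE sessions.
key = os.environ.get("OPENAI_API_KEY")
print("OPENAI_API_KEY is set" if key else "OPENAI_API_KEY is not set")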

You can also store the key in the environment variable by entering setx OPENAI_API_KEY YOUR_OPENAI_API_KEY in the terminal. In a Python development environment, open-interpreter is run by importing the library:

import interpreter

If you want to use another model, you need to specify it with the following code, otherwise the GPT-4 model will be used by default:

interpreter.model = "gpt-3.5-turbo"

To call open-interpreter in a development environment, you use the function interpreter.chat(). If you do not pass any content, an interactive chat is started, just as in the terminal. If you want more precise control, you can also pass the specific question into the function:

# Interactive chat
interpreter.chat()

# Precise control
interpreter.chat("your question content")

When open-interpreter gives the execution plan and code for a task, you need to enter y to accept the given plan or code. If you are not satisfied with the answer, you can enter n and restate the request to have the AI improve its answer until you are satisfied. Next, several work-scenario applications will demonstrate the powerful functions of open-interpreter.
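Putting the pieces together, driving open-interpreter from Python rather than the terminal looks roughly like this (a minimal sketch based on the 0.1.x Python API at the time of writing; attribute names may differ in later versions, and interpreter.auto_run skips the y/n confirmation described above, so enable it only when you trust the task):

import interpreter

# Use the cheaper GPT-3.5-Turbo model instead of the default GPT-4
interpreter.model = "gpt-3.5-turbo"

# Optional: skip the y/n confirmation before each code block (use with care)
interpreter.auto_run = True

# Ask a question; the model plans, writes and runs code on the local machine
interpreter.chat("List the five largest files in the current folder")

# Clear the conversation history before starting an unrelated task
interpreter.reset()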

Part4 Work Scenarios

1 Get web content

The most striking feature of open-interpreter is its networking support, so we first let it try to read and understand the content of a web page: we simply ask it to read the GitHub URL of the open-interpreter project and give a brief "self-introduction".

First, we start open-interpreter from the terminal (the GPT-4 model is used here) and ask it: "What is the main content of this Github project? https://github.com/KillianLucas/open-interpreter". The AI then gives the corresponding solution steps and Python code:

Visit web solutions

We type y to accept this solution, and open-interpreter will run the code and give the result:

Visit web results

open-interpreter successfully read the content of the web page and gave a summary of the information.
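For reference, code for a task like this typically looks something like the following (a minimal sketch of the general approach, assuming the requests and beautifulsoup4 libraries and that GitHub still renders the README inside an article element; the code the model actually produces will differ from run to run):

import requests
from bs4 import BeautifulSoup

# Fetch the project page and pull out the README text rendered by GitHub
url = "https://github.com/KillianLucas/open-interpreter"
response = requests.get(url, timeout=30)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")
readme = soup.find("article")  # GitHub renders the README inside an <article> tag

if readme is not None:
    text = readme.get_text(separator="\n", strip=True)
    print(text[:1000])  # print the first part of the README for a quick look
else:
    print("README section not found on the page")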

Next, we try to let open-interpreter complete a simple crawler task. We want to obtain the "2022 Zhejiang Province Science and Technology Leading Enterprises Recognition List" and the "2022 Zhejiang Province Science and Technology Little Giant Enterprises Recognition List" from a notice issued by the Zhejiang Provincial Department of Science and Technology. The notice is at "https://kjt.zj.gov.cn/art/2023/1/13/art_1229225203_5055092.html", and the original web page looks like this:

Original web page information

We issue instructions to open-interpreter to obtain these contents, and the solution given by AI is:

Get form scheme from web page

As you can see, the AI first gives code to install four libraries: requests, beautifulsoup4, pandas, and openpyxl. Since these four libraries are already installed, we type n and ask it to modify the plan. The AI's revised plan is shown in the figure:

Correct the plan given by AI as required

We then repeat the above steps, accepting or rejecting the AI's code according to our needs and letting it improve until we are satisfied. Finally, open-interpreter completes the task and stores the table in the required directory:

Storage of web form retrieval results

The final Excel content stored by the AI is shown in the figure below. During the run, the AI correctly obtained the information we needed without including anything irrelevant; the task was completed very successfully:

Excel table content obtained by open-interpreter
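As a point of reference, a standalone script for scraping tables like these usually boils down to a few lines of requests plus pandas (a rough sketch assuming the lists are published as HTML table elements on the notice page and that lxml and openpyxl are installed; the output file name is a placeholder):

import requests
import pandas as pd

url = "https://kjt.zj.gov.cn/art/2023/1/13/art_1229225203_5055092.html"
headers = {"User-Agent": "Mozilla/5.0"}  # some government sites reject requests without a UA

response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()
response.encoding = response.apparent_encoding  # the page is in Chinese; fix the encoding

# Parse every HTML table on the page into DataFrames
tables = pd.read_html(response.text)
print(f"Found {len(tables)} tables on the page")

# Save each table to its own sheet of one Excel file (placeholder file name)
with pd.ExcelWriter("zhejiang_tech_enterprise_lists.xlsx") as writer:
    for i, table in enumerate(tables):
        table.to_excel(writer, sheet_name=f"table_{i + 1}", index=False)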

2 PDF file batch conversion

During data processing, we often encounter tables stored in PDF format, and Python can be used to save these tables in Excel format. There are four PDF files in a folder, and we instruct open-interpreter to extract the tables from them and save each one as an Excel file.

pdf batch conversion solution

Similarly, the AI gives the solution steps, and we keep adjusting them according to our needs. Finally, the AI completes the batch conversion and saves the converted Excel files in the same folder:

pdf conversion completed

The converted excel table
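For readers who prefer a standalone script, the same batch conversion can be written directly with a PDF table library; the sketch below uses pdfplumber plus pandas as one possible choice (the AI may well pick different libraries such as camelot or tabula, and the folder path here is a placeholder):

from pathlib import Path

import pandas as pd
import pdfplumber

folder = Path("pdf_tables")  # placeholder folder containing the four PDF files

for pdf_path in folder.glob("*.pdf"):
    frames = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                # first row as header, remaining rows as data
                frames.append(pd.DataFrame(table[1:], columns=table[0]))
    if frames:
        out_path = pdf_path.with_suffix(".xlsx")
        pd.concat(frames, ignore_index=True).to_excel(out_path, index=False)
        print(f"Saved {out_path.name}")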

3 Excel file merging

When processing data, we often run into this situation: due to database export limits or other reasons, a complete data set is split into several smaller data sets stored separately, and we need to merge them for data analysis. Such tasks can also be easily accomplished with open-interpreter.

For the example data in this section, we use the "Statistical Table of Green Credit Status of 21 Major Banks" from the "China Public Policy and Green Development Database" of the Enterprise Research and Social Science Big Data Platform (website: https://r.qiyandata.com/). There are five Excel tables in the folder, all with identical fields. We now instruct open-interpreter to merge the five tables into one large Excel table:

The “Statistical Table of Green Credit Status of 21 Major Banks” is located in the “Green Finance”-“Green Credit” module under CPPGD.

The China Public Policy and Green Development Database ("CPPGD" for short) was jointly launched by Enterprise Research Data, the China Rural Development Research Institute of Zhejiang University, and the School of Economics of Zhejiang Gongshang University. It is a special database created to serve academic and policy research on China's green development and related fields, supporting the country's series of major strategic deployments toward the "carbon peaking and carbon neutrality" goals.

For more data-related information, please see the original article!

excel merge plan

Finally, open-interpreter successfully merged five excel tables into a master table named “merged.xlsx”:

excel merge results

The final merged table has 25 rows and 11 fields:

Merged excel data
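For reference, the core of such a merge is only a few lines of pandas (a sketch assuming the five Excel files sit in one folder and share identical columns; the folder path is a placeholder, while "merged.xlsx" matches the output name above):

from pathlib import Path

import pandas as pd

folder = Path("green_credit_tables")  # placeholder folder holding the five Excel files

# Read every Excel file in the folder and stack them into one DataFrame
parts = [pd.read_excel(path) for path in sorted(folder.glob("*.xlsx"))]
merged = pd.concat(parts, ignore_index=True)

merged.to_excel("merged.xlsx", index=False)
print(f"Merged {len(parts)} files into merged.xlsx with {len(merged)} rows")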

Part5 Summary

With the development of large language models, AI can be applied ever more widely, and AI tools of all kinds keep emerging. The open-interpreter introduced in this article solves, to a certain extent, the problem that the GPT models cannot access the Internet; its local execution lets it operate on local files, and the code-confirmation step addresses security concerns, making it a good LLM extension application. Of course, due to space limits, this article does not demonstrate all of open-interpreter's functions; interested readers can refer to the Colab notebook[2] posted by the author Killian Lucas on the GitHub project page, or install it and explore for themselves.

References

[1] Limited third-party libraries: https://wfhbrian.com/mastering-chatgpts-code-interpreter-list-of-python-packages/

[2] Colab notebook: https://qiyandata.feishu.cn/wiki/DLmNwx0wBisEbVkz2KbcPsr9nVf?comment_id=7278950529178402820&comment_anchor=true

Past recommendations

Python practice | Extracting indicators from text using regular expressions

Python teaching | Understand “classes and instances” in object-oriented in one article

Data Visualization | That’s right! Plotting like ggplot2 in Python

Python practice | ChatGPT + Python realizes fully automatic data processing/visualization

Python practice | How to use Python to call API