ChatGPT goes online: the last seal is lifted

Edited by Heart of the Machine

Nothing can stop ChatGPT now?

ChatGPT is often described as a “super-high IQ” artificial intelligence, especially since its upgrade to the GPT-4 model. However, its training data only runs up to 2021, and an AI cannot accurately answer questions about things it was never trained on.

Since its launch last November, ChatGPT has been used by countless people, many of whom have asked, in one form or another, for this large language model to be given access to more data. On March 24, OpenAI finally announced that it had partially lifted the restriction that kept ChatGPT off the Internet.

OpenAI’s solution is to use third-party plugins as a bridge, letting the AI “see” external data in a safer environment. Yesterday the company published the first batch of ChatGPT plugins, created by Expedia, FiscalNote, Instacart, KAYAK, Klarna, Milo, OpenTable, Shopify, Slack, Speak, Wolfram, and Zapier.

Specifically, plugins now let ChatGPT do the following:

  • Retrieve real-time information: such as sports scores, stock prices, latest news, etc.;

  • Retrieve knowledge base information: such as company documents, personal notes, etc.;

  • Perform actions on behalf of the user: for example, book a flight, order a meal, etc.

In addition, OpenAI hosts two plugins of its own, a web browser and a code interpreter, and has open-sourced the code for a knowledge base retrieval plugin. Any developer can now build their own plugins to extend the information available to ChatGPT.

Access to the alpha version of plugins is being extended to users and developers on the waiting list. OpenAI says it will initially give priority to a small number of developers and ChatGPT Plus users, and it plans to open access on a larger scale in the future.

To make this concrete, here is an example: users can choose and install the Wolfram plugin from within ChatGPT to improve its computational abilities.

f1509f45c8f1cf76ed84af6ca03da1a6.png

The data in Wolfram Alpha comes from major academic websites, publications, and scientific institutions, so its answers rest on curated, authoritative sources. Does ChatGPT feel more powerful to you now?

Overview

Although today’s large language models can complete a wide variety of tasks, they are still limited. Their training data is the only information they can learn from, and that data may be out of date and is one-size-fits-all rather than tailored to each application. Furthermore, the only thing a language model can do out of the box is output text. That text may contain useful instructions, but actually following them requires a further process outside the model.

While not a perfect analogy, plugins can be the “eyes and ears” of a language model, allowing the language model to access new, private, or specific information not contained in the training data.

In response to explicit user requests, plugins can also enable language models to perform safe, restricted operations on the user’s behalf, increasing the usefulness of the overall system.

OpenAI expects open standards to emerge that unify the way applications expose an interface to AI, and its plugin system is an early attempt at what such a standard might look like.

Today, OpenAI began gradually enabling the plugins built by its early collaborators for ChatGPT users, starting with ChatGPT Plus subscribers, and also began letting developers create their own plugins for ChatGPT.

In the coming months, as the security system improves, OpenAI plans to enable developers using OpenAI models to integrate plugins into their own applications, not just ChatGPT.

Security and wider implications

Of course, connecting language models to external tools opens up new opportunities, but also significant new risks.

Plugins offer the potential to address various challenges associated with large language models, including “hallucination”, keeping up with recent events, and accessing (with permission) proprietary sources of information. By integrating explicit access to external data, such as up-to-date online information, code-based computations, or information retrieved by custom plugins, language models can back their responses with evidence-based references.

These references not only enhance the usefulness of the model, but also enable users to assess the trustworthiness of the model output and double-check its accuracy, potentially mitigating the risks associated with over-dependence discussed in the recent GPT-4 system card. Finally, the value of plugins may transcend existing limitations by helping users handle a variety of new use cases, from browsing product catalogs to booking flights or ordering food.

At the same time, plugins may take harmful or unintended actions and increase the ability of bad actors to defraud, mislead, or abuse others, which makes the safety challenge harder. By widening the range of possible applications, plugins may also raise the risk of negative consequences from wrong or misaligned actions taken by the model in new domains.

These factors guide the development of the ChatGPT plug-in platform, and OpenAI has introduced a number of safeguards for this.

Before the release, OpenAI ran red-teaming exercises, both internally and with external collaborators, that surfaced many possible scenarios of concern. For example, red teamers found that plugins released without safeguards could carry out sophisticated prompt injection, send fraudulent or spam emails, bypass safety restrictions, or misuse information sent to the plugin.

OpenAI is using these findings to design safety mitigations that restrict risky plugin behavior and make it more transparent to users how and when plugins operate. The findings also informed the decision to roll out plugin access gradually.

Plugins can have wide-ranging societal impact. For example, in a recently published paper, OpenAI researchers found that language models with access to tools could have a greater economic impact than language models without them. More generally, in line with findings from other researchers, the current wave of artificial intelligence technology is expected to have a large effect on the pace at which jobs are transformed, replaced, and created.

Let ChatGPT browse the web

Motivated by earlier work such as WebGPT, GopherCite, BlenderBot2, and LaMDA2, OpenAI lets the language model read information from the Internet. This strictly expands the range of content it can discuss, going beyond the training corpus to include fresh, current information.

The image below shows the kind of experience browsing opens up for ChatGPT users. Previously, the model would have politely pointed out that its training data did not include enough information to answer. In this example, ChatGPT retrieves information about the most recent Academy Awards (March 13, 2023) and then performs its familiar poetry routine, showing how browsing can add to the experience.

Q: Can you tell me which person/movie has won an Oscar in these categories?

  • Best Actor

  • Best Soundtrack

  • Best Picture

  • Best Supporting Actor

Then come up with a poem to tie them all together.

ChatGPT will give you a series of search results, and you can click directly to view relevant information sources.

593df8c2fd9ed47fdab06c8a2776538d.jpeg

3bb64c48372e89378e54cf2bfa6cc2b0.png

Beyond the practical value to end users, OpenAI believes that enabling language and chat models to do thorough, interpretable research holds promise for scalable alignment work.

It’s worth noting that the plugin’s text-based web browser is limited to making GET requests, which reduces but doesn’t eliminate certain classes of security risk. This scopes the browsing plugin to retrieving information and excludes “transactional” operations such as form submissions, which carry a greater security risk.

The browsing feature uses the Microsoft Bing Search API to retrieve content from the web. As a result, it inherits much of Microsoft’s work on source reliability and the truthfulness of information, as well as a “safe mode” that prevents the retrieval of problematic content. The plugin runs in a separate service, so ChatGPT’s browsing activity is isolated from the rest of the infrastructure.

To respect content creators and comply with web norms, the ChatGPT browser plugin uses the user-agent token ChatGPT-User and is configured to honor each site’s robots.txt file. This may occasionally produce a “click failed” message, which indicates that the plugin is obeying the site’s instruction not to crawl it. The user agent is only used to take direct actions on behalf of ChatGPT users and is never used to crawl the web automatically. OpenAI also publishes its IP egress ranges and applies rate limiting to avoid sending excessive traffic to websites.
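The two rules described above, GET-only requests and obedience to robots.txt, can be sketched with Python’s standard library. This is a minimal illustration and not OpenAI’s implementation: the helper names `build_robots_parser` and `may_request`, the sample robots.txt, and the example.com URLs are all assumptions made for the example. Only the user-agent token ChatGPT-User comes from the article.

```python
from urllib import robotparser

USER_AGENT = "ChatGPT-User"   # user-agent token named in the article
ALLOWED_METHODS = {"GET"}     # the browser plugin is limited to GET requests


def build_robots_parser(robots_txt: str) -> robotparser.RobotFileParser:
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_request(parser: robotparser.RobotFileParser, method: str, url: str) -> bool:
    """Allow only GET requests to URLs that robots.txt permits for our agent."""
    if method.upper() not in ALLOWED_METHODS:
        return False
    return parser.can_fetch(USER_AGENT, url)


# Hypothetical robots.txt, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: ChatGPT-User
Disallow: /private/

User-agent: *
Disallow:
"""

parser = build_robots_parser(EXAMPLE_ROBOTS)
print(may_request(parser, "GET", "https://example.com/news/today"))     # permitted path
print(may_request(parser, "GET", "https://example.com/private/notes"))  # disallowed path
print(may_request(parser, "POST", "https://example.com/news/today"))    # non-GET method
```

In this sketch a refused request simply returns False; in the plugin, a refusal of the second kind is what surfaces to the user as a “click failed” message.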

Code Interpreter

An experimental ChatGPT model that can use Python and handle uploads and downloads

OpenAI provides the model behind ChatGPT with a Python interpreter that works in a sandboxed, firewalled execution environment, and some temporary disk space. Code run by the interpreter plugin is evaluated in a persistent session that is active for the duration of the chat session (with a capped timeout), and subsequent calls can build on each other. Currently, this feature supports uploading files to the current session workspace and downloading work results.
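To make the idea of a persistent session concrete, here is a toy sketch in which successive code snippets share one namespace, so later calls can build on earlier ones. The class name `InterpreterSession` is hypothetical, and the sketch deliberately omits everything that makes the real feature safe: Python’s `exec` is not a sandbox, and there is no firewall, timeout, or disk quota here.

```python
import contextlib
import io


class InterpreterSession:
    """Toy stand-in for a persistent interpreter session (hypothetical name).

    WARNING: exec() is NOT a sandbox. This only illustrates shared state
    between calls, not the isolation described in the article.
    """

    def __init__(self) -> None:
        # One namespace lives as long as the session object does.
        self.namespace: dict = {}

    def run(self, code: str) -> str:
        """Execute a snippet in the shared namespace and return what it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


session = InterpreterSession()
session.run("values = [3, 1, 4, 1, 5]")      # first call defines a variable
output = session.run("print(sum(values))")   # second call reuses it
print(output.strip())                        # prints 14
```

A new `InterpreterSession` starts with an empty namespace, which mirrors the way state does not carry over from one chat session to the next.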

3cfa0528a1cda070078bdc186d7f6498.jpeg

Clicking “Finished calculating” in the figure expands that step:

2df239a89c610b7d67d6c52f14659484.png

From initial user research, OpenAI identified some valuable use cases for code interpreters:

  • Solve quantitative and qualitative mathematical problems

  • Perform data analysis and visualization

  • Convert files between formats
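As a rough idea of what such a task looks like in practice, the sketch below converts a small CSV document to JSON and computes a summary statistic, using only the standard library. The data, the column names, and the helper `csv_to_records` are invented for illustration; they are not taken from OpenAI’s examples.

```python
import csv
import io
import json
import statistics

# Hypothetical uploaded file contents, for illustration only.
CSV_TEXT = """city,temperature_c
Oslo,4.0
Cairo,30.0
Lima,20.0
"""


def csv_to_records(text: str) -> list:
    """Parse CSV text into a list of dicts with a numeric temperature column."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"city": row["city"], "temperature_c": float(row["temperature_c"])}
        for row in reader
    ]


records = csv_to_records(CSV_TEXT)

# Format conversion: the same rows, serialized as JSON.
json_text = json.dumps(records, indent=2)
print(json_text)

# Simple data analysis: mean of the numeric column.
mean_temp = statistics.mean(r["temperature_c"] for r in records)
print(mean_temp)  # prints 18.0
```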

Security mechanism

The first precaution when connecting a ChatGPT model to a programming language interpreter is to sandbox execution properly, so that AI-generated code has no unintended side effects in the real world. OpenAI executes code in a secured environment and uses strict network controls to prevent the executed code from reaching the external Internet. Additionally, OpenAI imposes resource limits on a per-session basis.
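One of the simplest resource limits to picture is a cap on running time. The sketch below runs a snippet in a separate Python process and stops it when a time limit is exceeded. It is an assumption-laden illustration, not OpenAI’s mechanism: the function name `run_with_time_limit` is hypothetical, and a child process with a timeout provides no network isolation, no memory cap, and no real sandboxing.

```python
import subprocess
import sys


def run_with_time_limit(code: str, timeout_s: float = 5.0) -> dict:
    """Run code in a child Python process, stopping it after timeout_s seconds.

    NOTE: this only caps wall-clock time. It does not block network access
    or limit memory, so it is not a sandbox.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "error": "time limit exceeded"}
    return {
        "ok": completed.returncode == 0,
        "stdout": completed.stdout,
        "error": completed.stderr,
    }


result = run_with_time_limit("print(2 + 2)")
print(result["ok"], result["stdout"].strip())
```

Calling `run_with_time_limit("while True: pass", timeout_s=1.0)` returns a result with `ok` set to False once the limit is reached, instead of hanging the caller.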

Disabling Internet access limits what the code sandbox can do, but it is probably the safest initial form of AI-assisted programming. Third-party plugins, by contrast, were designed as a safety-first way of connecting ChatGPT to the outside world.

Retrieval

The open-source retrieval plugin lets ChatGPT access personal or organizational information sources, with permission. Users can retrieve the most relevant document snippets from their data sources, such as files, notes, emails, or public documents, by asking questions or expressing their needs in natural language.

Because it is an open-source, self-hosted solution, developers can deploy their own version of the plugin and register it with ChatGPT. The plugin uses OpenAI embeddings and lets developers choose a vector database, such as Milvus, Pinecone, Qdrant, Redis, Weaviate, or Zilliz, to index and search documents. Information sources can be kept in sync with the database using webhooks.

The retrieval plugin lets ChatGPT search a vector database of content and add the best results to the ChatGPT session. This means it has no external effects, so the main risks concern data authorization and privacy. Developers should only add content to their retrieval plugin that they are authorized to use and that can be shared in users’ ChatGPT sessions.
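The core of “search a vector database and return the best results” is a nearest-neighbor lookup over embedding vectors. The sketch below shows that step with cosine similarity in plain Python. The three-dimensional vectors are hand-written stand-ins, not real OpenAI embeddings, and an in-memory list takes the place of a vector database; the function names and document titles are hypothetical.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list, index: list, k: int = 2) -> list:
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy index: (document snippet, made-up embedding). Real embeddings have
# hundreds or thousands of dimensions and come from an embedding model.
INDEX = [
    ("Q3 sales report", [0.9, 0.1, 0.0]),
    ("Holiday schedule", [0.0, 0.2, 0.9]),
    ("Q3 marketing plan", [0.7, 0.6, 0.1]),
]

query = [1.0, 0.2, 0.0]  # stand-in for the embedded user question
results = top_k(query, INDEX, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With these toy vectors the two Q3 documents rank ahead of the holiday schedule, and those top snippets are what would be added to the chat session.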

Third-party plug-ins

Using third-party plugins on ChatGPT works like this:

a9449809095180535bf48ea9e4ca347a.jpeg

Each third-party plugin is described by a manifest file, which includes a machine-readable description of the plugin’s capabilities and how to call them, as well as user-facing documentation.

{
  "schema_version": "v1",
  "name_for_human": "TODO Manager",
  "name_for_model": "todo_manager",
  "description_for_human": "Manages your TODOs!",
  "description_for_model": "An app for managing a user's TODOs",
  "api": { "url": "/openapi.json" },
  "auth": { "type": "none" },
  "logo_url": "https://example.com/logo.png",
  "legal_info_url": "http://example.com",
  "contact_email": "[email protected]"
}
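Because the manifest is machine-readable, a small misspelling in a key name is enough to make a field effectively missing. The sketch below loads a manifest and reports which expected keys are absent. The list of required fields is an assumption taken from the example manifest above rather than from an official schema, the helper `missing_fields` is hypothetical, and the contact address is a placeholder.

```python
import json

# Assumption: treat every key in the article's example manifest as required.
REQUIRED_FIELDS = (
    "schema_version",
    "name_for_human",
    "name_for_model",
    "description_for_human",
    "description_for_model",
    "api",
    "auth",
    "logo_url",
    "legal_info_url",
    "contact_email",
)


def missing_fields(manifest: dict) -> list:
    """Return the expected keys that the manifest does not contain."""
    return [field for field in REQUIRED_FIELDS if field not in manifest]


MANIFEST_TEXT = """
{
  "schema_version": "v1",
  "name_for_human": "TODO Manager",
  "name_for_model": "todo_manager",
  "description_for_human": "Manages your TODOs!",
  "description_for_model": "An app for managing a user's TODOs",
  "api": { "url": "/openapi.json" },
  "auth": { "type": "none" },
  "logo_url": "https://example.com/logo.png",
  "legal_info_url": "http://example.com",
  "contact_email": "contact@example.com"
}
"""

manifest = json.loads(MANIFEST_TEXT)
print(missing_fields(manifest))  # prints []

# A key written with spaces instead of underscores is reported as missing.
broken = dict(manifest)
broken["description for human"] = broken.pop("description_for_human")
print(missing_fields(broken))    # prints ['description_for_human']
```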

Steps to build a plugin:

1. Build the API endpoints you want the language model to call (this can be a new API, an existing API, or a wrapper around an existing API designed specifically for LLMs).

2. Create an OpenAPI specification that documents your API, and a manifest file that links to the OpenAPI specification and contains some plugin-specific metadata.
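For the second step, the sketch below builds a very small OpenAPI 3 document for a single hypothetical endpoint, GET /todos, and prints it as JSON in the form it could be served at the /openapi.json path that the manifest points to. The endpoint, the `operationId`, the titles, and the server URL are all invented for illustration; a real plugin would document its own API.

```python
import json


def build_openapi_spec(server_url: str) -> dict:
    """Return a minimal OpenAPI 3 document for one hypothetical endpoint."""
    return {
        "openapi": "3.0.1",
        "info": {
            "title": "TODO Plugin",
            "description": "Lets the model read a user's TODO list.",
            "version": "v1",
        },
        "servers": [{"url": server_url}],
        "paths": {
            "/todos": {
                "get": {
                    "operationId": "getTodos",
                    "summary": "Get the list of TODOs",
                    "responses": {
                        "200": {
                            "description": "OK",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "array",
                                        "items": {"type": "string"},
                                    }
                                }
                            },
                        }
                    },
                }
            }
        },
    }


spec = build_openapi_spec("https://example.com")  # placeholder server URL
print(json.dumps(spec, indent=2))
```

The summaries and descriptions matter more here than in an ordinary API document, since, as the article notes, documentation about enabled plugins is shown to the model as part of the conversation.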

When starting a conversation on chat.openai.com, users can choose which third-party plugins to enable. Documentation about the enabled plugins is shown to the language model as part of the conversation context, so the model can call the appropriate plugin API as needed to fulfill the user’s intent. Currently, plugins are designed to call backend APIs, and OpenAI is exploring plugins that can also call client-side APIs.

OpenAI says it is working hard to develop plugins and roll them out to a wider audience.

It also means that the storm set off by ChatGPT is continuing to sweep across everything.

Reference content:

https://openai.com/blog/chatgpt-plugins
