PaddleNLP Natural Language Processing Knowledge Graph: fixing the "Out of memory error on GPU 0" when using the uie-x-base, uie-m-large, and uie-m-base models

  • Author: @cargoyouxing
  • GitHub: https://github.com/cxlhyx
  • WeChat: 1297767084

Article directory

  • PaddleNLP
  • PaddleNLP features
    • 1. Large model text generation
    • 2. Out-of-the-box NLP toolset: one-click UIE prediction
    • 3. Rich and complete Chinese model library
    • 4. Industrial-grade end-to-end system examples
    • 5. High-performance distributed training and inference
    • 6. Direct links for knowledge-graph knowledge extraction
  • PaddleNLP environment construction
  • My problem description
  • Cause analysis
  • Solution

PaddleNLP

Keywords: natural language processing, NLP, large language models, knowledge graph construction, question answering systems, information retrieval, knowledge extraction

PaddleNLP is an easy-to-use and powerful development library for natural language processing and large language models (LLMs). It aggregates high-quality pre-trained models from industry and offers an out-of-the-box development experience. Its model library covers many NLP scenarios and industry practice examples, meeting developers' needs for flexible customization. A key advantage of PaddleNLP is its one-click prediction feature: no training is required, and you can feed in data directly to obtain open-domain extraction results.

PaddleNLP GitHub link: https://github.com/PaddlePaddle/PaddleNLP

PaddleNLP features

1. Large model text generation

PaddleNLP provides a convenient and easy-to-use Auto API that can quickly load models and Tokenizers.

2. Out-of-the-box NLP toolset: one-click UIE prediction

Taskflow provides a rich set of out-of-the-box, industrial-grade NLP preset models covering the two major scenarios of natural language understanding and generation, with industrial-grade accuracy and excellent inference performance.
PaddleNLP's one-click prediction feature: the Taskflow API.
No training is required: just feed in data directly to obtain open-domain extraction results.
Notably, this feature covers the entity recognition, relation extraction, and time extraction needed for knowledge graph construction.

(Figure: one-click prediction)

3. Rich and complete Chinese model library

The most comprehensive Chinese pre-trained model library in the industry
A curated set of 45+ network architectures and 500+ pre-trained model parameters, covering the industry's most comprehensive range of Chinese pre-trained models: the Wenxin NLP large models ERNIE and PLATO, as well as mainstream architectures such as BERT, GPT, RoBERTa, and T5. All can be downloaded with one click through the AutoModel API.

Application examples with full scene coverage
Covers NLP application examples from academia to industry, spanning basic NLP techniques, NLP system applications, and extended applications. All examples are developed on the new API system of PaddlePaddle core framework 2.0, providing developers with best practices for text processing on PaddlePaddle.
For selected pre-trained model examples, see the Model Zoo; for more scenario examples, see the examples directory. Interactive Notebook tutorials, backed by free compute, are also available on the AI Studio platform for hands-on practice.

4. Industrial-grade end-to-end system examples

PaddleNLP provides end-to-end system examples for high-frequency NLP scenarios such as information extraction, semantic retrieval, intelligent question answering, and sentiment analysis, covering the entire pipeline of data annotation, model training, model tuning, and prediction deployment, continually lowering the barrier to industrial adoption of NLP technology. For detailed usage of these system-level industrial examples, see the PaddleNLP Applications directory.
PaddleNLP provides multiple versions of industry examples:

Semantic retrieval system
For various data situations, such as unsupervised and supervised data, it combines SimCSE, In-batch Negatives, the ERNIE-Gram single-tower model, and more into a state-of-the-art semantic retrieval solution with both recall and ranking stages, covering the full pipeline of training, tuning, vector-search index construction, and querying.

Intelligent question answering and document question answering
A retrieval-based question answering system built on RocketQA, supporting business scenarios such as FAQ Q&A and manual-based Q&A.

Comment opinion extraction and sentiment analysis
Based on SKEP, the sentiment-knowledge-enhanced pre-trained model, it extracts evaluation dimensions and opinions from product reviews and performs fine-grained sentiment analysis.

Intelligent voice command analysis
Integrates PaddleSpeech and Baidu Open Platform speech recognition with UIE general information extraction to build an integrated intelligent voice command analysis system. This solution can be applied to scenarios such as voice form filling, voice interaction, and voice retrieval to improve human-computer interaction efficiency.

5. High-performance distributed training and inference

FastTokenizer: high-performance text processing library
To achieve more extreme deployment performance, after installing FastTokenizer you only need to turn on the use_fast=True option on the AutoTokenizer API to call the high-performance tokenization operators implemented in C++ and easily obtain text processing more than a hundred times faster than Python. For more usage instructions, please refer to the FastTokenizer documentation.

FastGeneration: high-performance generation acceleration library
Simply turn on the use_fast=True option on the generate() API to easily obtain more than 5x GPU acceleration on generative pre-trained models such as Transformer, GPT, BART, PLATO, and UniLM. For more instructions, please refer to the FastGeneration documentation.

Fleet: PaddlePaddle's 4D hybrid parallel distributed training technology
For usage instructions on distributed training of hundred-billion-parameter AI models, please refer to the GPT-3 example.

6. Direct links for knowledge-graph knowledge extraction

  • General Information Extraction application
  • Text UIE (Universal Information Extraction)
  • Document UIE: Taskflow usage guide
  • ERNIE-Health Chinese medical pre-trained model: fine-tune the pre-trained model on medical-domain data to complete Chinese medical language understanding tasks

PaddleNLP environment construction

PaddleNLP's environment requirements are relatively complex. You need the following software and libraries:

1. CUDA and cuDNN: the trickiest part of setting up the PaddleNLP environment.
2. PaddlePaddle: the pip package is paddlepaddle (or paddlepaddle-gpu), while paddle is the import name, so the two names refer to the same framework and are installed the same way.
3. paddlenlp: the simplest of the three to install.

Note that the setup process is fairly involved, and the version requirements between these components are strict.
The blogger uses: anaconda3-2022.10-Windows-x86_64 + pycharm-community-2023.2.3 + CUDA 11.8 + cuDNN 8.9 + paddle 2.5.2 + paddlenlp 2.6.1.post

You can refer to the paddlenlp installation tutorial by the blogger suibianshen2012; following its steps one by one should give you a matching set of versions. It is recommended not to choose versions that are too new, as they tend to have many bugs.
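A hedged sketch of the install commands, pinned to the blogger's versions; pick the paddlepaddle-gpu wheel that matches your CUDA version from the official PaddlePaddle install guide (plain `pip install paddlepaddle` gives the CPU-only build):

```shell
pip install paddlepaddle-gpu==2.5.2
pip install paddlenlp==2.6.1

# Verify the install and GPU visibility.
python -c "import paddle; paddle.utils.run_check()"
```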

My problem description

The following error occurs when using Taskflow's uie-x-base, uie-m-large, and uie-m-base models, while other models work fine.

Out of memory error on GPU 0. Cannot allocate 70.312500MB memory on GPU 0, 1.999999GB memory has been allocated and available memory is only 0.000000B.

Please check whether there is any other process using GPU 0.

1. If yes, please stop them, or start PaddlePaddle on another GPU.
2. If no, please decrease the batch size of your model.

Cause analysis

The error message shows that GPU memory is insufficient.

ChatGPT suggested that it might be solved by the following methods:

  1. Check if any other process is using GPU 0. If so, stop these processes and try restarting your model.
  2. If no other processes are using GPU 0, try reducing the model’s batch size. This may help reduce the amount of computation required, thus freeing up more memory.
  3. Check that the model is configured correctly. Make sure your model has the parameters required for training and inference correctly set up. Sometimes, incorrect parameter settings can cause out of memory issues.
  4. Check whether your operating system provides adequate memory management capabilities. For example, some operating systems may allow you to have some control over the memory used by a program while it is running. You can check these options in your operating system’s settings.
  5. If none of the above resolves the issue, you may want to consider upgrading your hardware configuration to better handle larger models.
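Step 1 above (checking whether another process occupies GPU 0) can be scripted as a small sketch: query nvidia-smi for per-process GPU memory usage, falling back gracefully on machines without an NVIDIA driver.

```python
import shutil
import subprocess

if shutil.which("nvidia-smi"):
    # List each compute process with its PID, name, and GPU memory usage.
    out = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv"],
        capture_output=True, text=True,
    )
    msg = out.stdout.strip() or "no compute processes found"
else:
    msg = "nvidia-smi not found: no NVIDIA driver on this machine"

print(msg)
```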

None of these worked. And since we are only using the model for direct prediction rather than training, GPU memory should not, in principle, be insufficient.

Solution

Modify the device_id parameter

Hold Ctrl and click on Taskflow to jump to its constructor.


You can see that if your machine has only a CPU, or device_id == -1, the CPU is used for training or prediction. If your machine has a GPU and you do not specify device_id, gpu:0 is used.

Since we are using gpu:0 and it reports the error, you can set Taskflow's device_id parameter to switch to the CPU or another GPU to run the program.

After setting the parameter, the program ran correctly. Curiously, when I later removed the parameter, the error no longer appeared.