This article comes from the featured projects of the AI Studio community.
Introduction
- Use PaddleNLP and the pre-trained Vicuna large language model parameters to build a conversational bot based on the LLaMA model.
- Note: run this project in a V100 32G or higher environment, with the latest development versions of PaddlePaddle and PaddleNLP. The model parameters are for non-commercial use only.
- UPDATE: Vicuna 13B model parameters have been added; the larger model gives better results.
References
- facebookresearch/llama
- lm-sys/FastChat
- lmsys/vicuna-7b-delta-v1.1
- lmsys/vicuna-13b-delta-v1.1
- PaddlePaddle/PaddleNLP
LLaMA model
- LLaMA is a large language model open-sourced by Meta.
- Its full name is Large Language Model Meta AI, and it comes in sizes ranging from 7 billion to 65 billion parameters.
- The 13-billion-parameter LLaMA model outperforms the 175-billion-parameter GPT-3 on most benchmarks and can run on a single V100 GPU.
- The largest LLaMA model, with 65 billion parameters, is comparable to DeepMind's Chinchilla-70B and Google's PaLM-540B.
Vicuna model
- Vicuna is an open-source large model jointly released by researchers from UC Berkeley, CMU, Stanford, and other institutions.
- It is based on Meta's open-source LLaMA model and fine-tuned on user-shared conversations from the ShareGPT platform.
- Open-source pre-trained models are available in 7B and 13B sizes.
Download model
- Download the model you need; Vicuna 13B is downloaded by default.
    # Download Vicuna 7B
    # !git lfs clone http://git.aistudio.baidu.com/180581/vicuna-7b-v1.1.git

    # Download Vicuna 13B
    !git lfs clone http://git.aistudio.baidu.com/180581/vicuna-13b-v1.1.git
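After the clone finishes, it can be worth confirming that every weight shard arrived. The sketch below assumes the shard files follow the `paddle-model-?-of-?.pdparams` naming that the loading code in this article globs for, and that the digit after `of` is the total number of shards. `find_shards` is a hypothetical helper, not part of PaddleNLP.

```python
import glob
import os
import re

# Assumed naming scheme: paddle-model-<index>-of-<total>.pdparams
SHARD_RE = re.compile(r'paddle-model-(\d)-of-(\d)\.pdparams$')


def find_shards(ckpt_dir, pattern='paddle-model-?-of-?.pdparams'):
    """Return sorted shard paths, raising if the set looks incomplete."""
    paths = sorted(glob.glob(os.path.join(ckpt_dir, pattern)))
    if not paths:
        raise FileNotFoundError(f'no checkpoint shards found in {ckpt_dir!r}')
    totals = set()
    for path in paths:
        match = SHARD_RE.search(path)
        if match is None:
            raise ValueError(f'unexpected shard name: {path}')
        totals.add(int(match.group(2)))
    # Assumption: every name announces the same total, equal to the file count.
    if totals != {len(paths)}:
        raise ValueError(
            f'found {len(paths)} shard(s), but names announce {sorted(totals)}')
    return paths
```

Calling `find_shards('vicuna-13b-v1.1')` before building the model would surface an interrupted download early, instead of partway through loading.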
Environment configuration
- Some features used here require the latest versions of Paddle and PaddleNLP, so install the latest development builds of both.
    !pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html --user
    !pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html --user
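A quick way to confirm what the two commands above actually installed is to query the package metadata. This is a minimal sketch using only the standard library; the distribution names are taken from the pip commands, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ('paddlepaddle-gpu', 'paddlenlp'):
    print(name, installed_version(name) or 'not installed')
```

In a notebook, the kernel may need a restart after installation before the new versions are picked up.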
Model loading
- Create a LLaMA model.
- Load the model parameters for Vicuna 7B / 13B.
    import os
    import glob

    import paddle
    from tqdm import tqdm
    from paddlenlp.transformers import LlamaForCausalLM, LlamaConfig, LlamaTokenizer

    # File name pattern of the checkpoint shards
    pattern = 'paddle-model-?-of-?.pdparams'

    # Vicuna 7B
    # ckpt_dir = 'vicuna-7b-v1.1'
    # config_dict = {
    #     "hidden_size": 4096,
    #     "initializer_range": 0.02,
    #     "intermediate_size": 11008,
    #     "max_position_embeddings": 2048,
    #     "model_type": "llama",
    #     "num_attention_heads": 32,
    #     "num_hidden_layers": 32,
    #     "rms_norm_eps": 1e-06,
    #     "vocab_size": 32000,
    #     "bos_token_id": 1,
    #     "eos_token_id": 2,
    #     "pad_token_id": 0,
    #     "use_cache": True,
    #     "use_recompute": False,
    #     "use_flash_attention": False,
    # }

    # Vicuna 13B
    ckpt_dir = 'vicuna-13b-v1.1'
    config_dict = {
        "hidden_size": 5120,
        "initializer_range": 0.02,
        "intermediate_size": 13824,
        "max_position_embeddings": 2048,
        "model_type": "llama",
        "num_attention_heads": 40,
        "num_hidden_layers": 40,
        "rms_norm_eps": 1e-06,
        "vocab_size": 32000,
        "bos_token_id": 1,
        "eos_token_id": 2,
        "pad_token_id": 0,
        "use_cache": True,
        "use_recompute": False,
        "use_flash_attention": False,
    }

    # Create the model weights in float16
    paddle.set_default_dtype('float16')

    tokenizer = LlamaTokenizer.from_pretrained(ckpt_dir)
    config = LlamaConfig(**config_dict)
    model = LlamaForCausalLM(config)
    model.eval()

    # Keep the rotary embedding frequencies in float32
    for name, layer in model.named_sublayers():
        if 'rotary_emb' in name:
            layer.inv_freq = layer.inv_freq.cast(paddle.float32)

    paddle.device.cuda.empty_cache()

    # Load the checkpoint shards one at a time and free each one after use
    for file_path in tqdm(glob.glob(os.path.join(ckpt_dir, pattern))):
        params = paddle.load(file_path)
        assert model.set_dict(params)[1] == [], 'Load error.'
        del params
        paddle.device.cuda.empty_cache()
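The V100 32G requirement can be sanity-checked from the configuration values alone. The sketch below counts the weights implied by `config_dict`, assuming the standard LLaMA layout (four attention projections and three MLP projections per layer, no biases, two RMSNorm vectors per layer, and an output head that is not tied to the input embedding). `llama_param_count` is a hypothetical helper, and the estimate covers weights only, not activations or the key/value cache.

```python
def llama_param_count(cfg):
    """Count the weights of a standard LLaMA decoder described by cfg."""
    h = cfg['hidden_size']
    inter = cfg['intermediate_size']
    layers = cfg['num_hidden_layers']
    vocab = cfg['vocab_size']

    attention = 4 * h * h        # q, k, v and output projections, no biases
    mlp = 3 * h * inter          # gate, up and down projections
    norms = 2 * h                # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms

    embeddings = 2 * vocab * h   # input embedding plus untied output head
    final_norm = h
    return layers * per_layer + embeddings + final_norm


vicuna_7b = {'hidden_size': 4096, 'intermediate_size': 11008,
             'num_hidden_layers': 32, 'vocab_size': 32000}
vicuna_13b = {'hidden_size': 5120, 'intermediate_size': 13824,
              'num_hidden_layers': 40, 'vocab_size': 32000}

params = llama_param_count(vicuna_13b)
fp16_gib = params * 2 / 1024 ** 3   # float16 stores each weight in 2 bytes
print(f'{params:,} parameters, about {fp16_gib:.1f} GiB in float16')
```

Under these assumptions the 13B configuration comes to roughly 13.0 billion weights, or about 24 GiB in float16, which leaves only a few GiB of a 32 GB card for everything else.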
Dialogue Bot
- Implement the chatbot with the following prompt format:

    'USER: {input}\n\nASSISTANT: {output}'
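To make the format concrete, here is a minimal sketch that renders a conversation history into a single prompt string. It assumes the separator between turns is a blank line (`\n\n`); `build_prompt` is a hypothetical helper introduced only for illustration.

```python
def build_prompt(history, user_input, sep='\n\n'):
    """Render past (user, assistant) turns plus the new user message."""
    parts = []
    for user_text, assistant_text in history:
        parts.append(f'USER: {user_text}{sep}ASSISTANT: {assistant_text}{sep}')
    # End with an open ASSISTANT slot so the model writes the next reply.
    parts.append(f'USER: {user_input}{sep}ASSISTANT: ')
    return ''.join(parts)


history = [('Where is the capital of China?', 'The capital of China is Beijing.')]
print(build_prompt(history, 'What about America?'))
```

Because earlier turns stay in the prompt, the model can resolve a follow-up such as "What about America?" from context.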
    input_text = input('USER: ')
    prompt = f'''USER: {input_text}\n\nASSISTANT: '''

    with paddle.no_grad():
        with paddle.amp.auto_cast(False, level='O2', dtype='float16'):
            while True:
                # Type "exit" to end the conversation
                if input_text == 'exit':
                    break

                inputs = tokenizer(
                    prompt,
                    return_tensors="pd",
                    return_attention_mask=True,
                    return_position_ids=True
                )

                outputs = model.generate(
                    input_ids=inputs.input_ids,
                    attention_mask=inputs.attention_mask,
                    position_ids=inputs.position_ids,
                    # Leave room for the prompt within the 2048-token context
                    max_length=2048 - inputs.input_ids.shape[1],
                    min_length=0,
                    decode_strategy="sampling",
                    temperature=0.8,
                    top_k=40,
                    top_p=0.95,
                    repetition_penalty=1.1,
                    bos_token_id=tokenizer.bos_token_id,
                    eos_token_id=tokenizer.eos_token_id,
                    pad_token_id=tokenizer.pad_token_id,
                    use_cache=True,
                    use_fast=True,
                    use_fp16_decoding=True)

                response = tokenizer.decode(outputs[0][0], skip_special_tokens=True)
                print('ASSISTANT: ' + response)

                # Append the reply and the next user message to the running prompt
                input_text = input('USER: ')
                prompt += f'''{response}\n\nUSER: {input_text}\n\nASSISTANT: '''

                del inputs
                del outputs
                del response
                paddle.device.cuda.empty_cache()
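The `temperature`, `top_k`, and `top_p` arguments passed to `generate` control how each next token is drawn. The following is an illustrative pure-Python sketch of that kind of sampling, not PaddleNLP's implementation: the order of the steps is an assumption, `sample_next_token` is a hypothetical name, and the repetition penalty is left out.

```python
import math
import random


def sample_next_token(logits, temperature=0.8, top_k=40, top_p=0.95, rng=random):
    """Pick one token id from raw logits using temperature, top-k and top-p."""
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Softmax, with the maximum subtracted for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable token ids.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Draw one surviving token, weighted by its probability.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


print(sample_next_token([2.0, 1.0, 0.1, -1.0]))
```

Lower `temperature`, smaller `top_k`, or smaller `top_p` all make the output more deterministic; with `top_k=1` the function always returns the most likely token.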
USER: Hello, who are you?

ASSISTANT: Hello! I am Assistant, a language model.

USER: Write a poem about artificial intelligence

ASSISTANT: Artificial intelligence is so amazing
it can think about complex problems
Possess a wealth of knowledge and information
can help us solve the problem
However, it also has limitations
Inability to feel emotion and real experience
just a highly intelligent machine
So, although we can use it to get various services
But we also have to realize it's not a human replacement
It is an important means of co-existing with human beings
Therefore, we should learn how to work with
Take advantage of its advantages and overcome its disadvantages
make this world a better place
This is the mission of artificial intelligence

USER: Help me write a code to read jpg images using python and opencv.

ASSISTANT: When you want to read JPEG images, you can use the PIL library or OpenCV library in Python. Here is a sample code for reading a JPEG image using the OpenCV library:

```python
import cv2

# open the image
image = cv2.imread('example.jpg')

# display image
cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

In the above code, `cv2.imread()` function is used to read JPEG image and save it in `image` variable. Then, you can use the `cv2.imshow()` function to display the image on the console, and use the `cv2.waitKey()` function to wait for the user to press a key event. Finally, you can use the `cv2.destroyAllWindows()` function to release the image and close the graphics window.

USER: Where is the capital of China?

ASSISTANT: The capital of China is Beijing.

USER: What about America?

ASSISTANT: The capital of the United States is Washington.

USER: exit
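Since every exchange is appended to the prompt, a long conversation eventually approaches the 2048-token context, at which point the `max_length` passed to `generate` shrinks toward zero. One possible mitigation, sketched below as an assumption rather than something the original project does, is to drop the oldest exchanges until the prompt fits a token budget. `fit_to_budget` and its `count_tokens` hook are hypothetical; the whitespace counter in the example stands in for a real tokenizer.

```python
def fit_to_budget(past_segments, pending_segment, max_tokens, count_tokens):
    """Build a prompt from the newest segments that fit within max_tokens.

    past_segments: already rendered earlier exchanges, oldest first.
    pending_segment: the current user turn with its open ASSISTANT slot,
        which is always kept.
    count_tokens: callable that maps text to a token count.
    """
    kept = list(past_segments)
    while kept and count_tokens(''.join(kept) + pending_segment) > max_tokens:
        kept.pop(0)  # drop the oldest exchange first
    return ''.join(kept) + pending_segment


def count_words(text):
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


past = ['USER: a b c\n\nASSISTANT: d e f\n\n', 'USER: g\n\nASSISTANT: h\n\n']
pending = 'USER: i\n\nASSISTANT: '
print(repr(fit_to_budget(past, pending, max_tokens=7, count_tokens=count_words)))
```

In the chat loop, the budget would be the context length minus however many tokens should be reserved for the reply, and `count_tokens` would be backed by the loaded tokenizer.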
Summary
- The dialogue example above shows that Vicuna 13B performs reasonably well: it understands Chinese fairly well, holds basic context-aware conversations with the user, and can write simple code.
- However, Vicuna's results are not outstanding, and there is still a clear gap between it and commercial models such as Wenxin Yiyan (ERNIE Bot) and ChatGPT.