Large Model Deployment Notes (21): Windows + ChatGLM3

Having heard that ChatGLM3 was released, Zhang Xiaobai wanted to try it on Windows:

Install Git Large File Storage first:

Open git-lfs.github.com

Click Download

Install the downloaded file:

Open Git Bash:

Execute: git lfs install

At this point, Git LFS is installed successfully.
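As a quick sanity check, the version command should now work in Git Bash:

git lfs version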

cd /f/

git clone https://huggingface.co/THUDM/chatglm3-6b

For well-known reasons, this route is blocked.

So I can only fetch the model from ModelScope instead:

Open https://modelscope.cn/models/ZhipuAI/chatglm3-6b/summary

Open the Anaconda PowerShell Prompt:

conda activate model310

F:
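If the modelscope package is not yet installed in this environment (an assumption; it may already be there from an earlier note in this series), install it first:

pip install modelscope -i https://pypi.tuna.tsinghua.edu.cn/simple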

Enter the Python command line and execute:

from modelscope import snapshot_download
# download chatglm3-6b (revision v1.0.0) from ModelScope into the local cache
model_dir = snapshot_download("ZhipuAI/chatglm3-6b", revision="v1.0.0")
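snapshot_download returns the local path of the downloaded snapshot, so printing it shows where the files landed:

print(model_dir)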

The downloaded model files are placed in C:\Users\xishu\.cache\modelscope\hub\ZhipuAI\chatglm3-6b

Move it to F:\models\THUDM\chatglm3-6b
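One way to do the move from the command line (robocopy handles a whole directory tree across drives; dragging in Explorer works just as well):

robocopy C:\Users\xishu\.cache\modelscope\hub\ZhipuAI\chatglm3-6b F:\models\THUDM\chatglm3-6b /E /MOVE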

Back to F:\

git clone https://github.com/THUDM/ChatGLM3

The clone fails too, so I have no choice but to download the zip package and unzip it:

cd F:\ChatGLM3-main

conda deactivate

conda create -n chatglm3 python=3.10

conda activate chatglm3

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Note that this pulls in the CPU build of torch 2.1.0; remember to switch to the GPU build with conda afterwards.

Open the PyTorch site and install the CUDA build according to its instructions:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
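To confirm the GPU build took effect, a quick check on the Python command line:

import torch
print(torch.__version__)           # should no longer show "+cpu" after the conda install
print(torch.cuda.is_available())   # True means torch can see the GPU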

pip install chardet

Execute the following script on the Python command line:

from transformers import AutoTokenizer, AutoModel

# use the full local model path; a raw string avoids backslash-escape issues on Windows
MODEL_PATH = r"F:\models\THUDM\chatglm3-6b"
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True, device='cuda')
model = model.eval()
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
response, history = model.chat(tokenizer, "What can you do", history=history)
print(response)
response, history = model.chat(tokenizer, "Tell me what kind of movie Forrest Gump is", history=history)
print(response)
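If GPU memory is tight, the THUDM README also shows a 4-bit quantized load; a sketch reusing the same path:

# 4-bit quantization per the model's README; trades some quality for much less VRAM
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).quantize(4).cuda()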

pip install jupyter -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install jupyterlab -i https://pypi.tuna.tsinghua.edu.cn/simple

cd composite_demo

Modify client.py

Change

MODEL_PATH = os.environ.get('MODEL_PATH', 'THUDM/chatglm3-6b')

to

MODEL_PATH = r"F:\models\THUDM\chatglm3-6b"
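Since client.py reads MODEL_PATH from the environment first, an alternative that leaves the file untouched is to set the variable in PowerShell before launching:

$env:MODEL_PATH = "F:\models\THUDM\chatglm3-6b"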

streamlit run main.py

The browser automatically pops up:

Switch to Tool mode:

Let's try to see whether ChatGLM3 can be embedded into Langchain-Chatchat.

Open the configuration file F:\Langchain-Chachat\configs\model_config.py

Add the local chatglm3-6b model path to it (a rough sketch follows):
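The exact keys depend on the Langchain-Chatchat version, so treat this as a hedged sketch (assuming a 0.2.x-style llm_model path dict), not the literal config content:

# in the llm_model section of model_config.py -- key layout is version-dependent
"chatglm3-6b": r"F:\models\THUDM\chatglm3-6b",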

Switch to the chatchat conda environment:

conda deactivate

conda activate chatchat

F:

cd \Langchain-Chachat

python startup.py --all-webui

After the window pops up, switch to LLM conversation mode:

It can actually render LaTeX formulas.

I looked up the equation on Baidu:

Is this the right answer?

(Full text finished, thank you for reading)