ChatGLM3: A More Powerful Bilingual Conversation Model

ChatGLM3 is a new generation of pre-trained dialogue models jointly released by Zhipu AI and Tsinghua University's KEG Lab. ChatGLM3-6B is the open-source model in the ChatGLM3 series. While retaining the strengths of the previous two generations, such as fluent dialogue and a low deployment threshold, ChatGLM3-6B introduces the following features:

1. A more powerful base model: ChatGLM3-6B-Base, the base model behind ChatGLM3-6B, uses more diverse training data, more training steps, and a more sensible training strategy. Evaluations on datasets covering semantics, mathematics, reasoning, code, and knowledge show that ChatGLM3-6B-Base has the strongest performance among base models under 10B parameters.

2. More complete feature support: ChatGLM3-6B adopts a newly designed prompt format. In addition to normal multi-turn conversation, it natively supports complex scenarios such as tool calling (Function Call), code execution (Code Interpreter), and Agent tasks.

3. A more comprehensive open-source lineup: In addition to the dialogue model ChatGLM3-6B, the base model ChatGLM3-6B-Base and the long-context dialogue model ChatGLM3-6B-32K are also open-sourced. All of these weights are fully open for academic research, and free commercial use is allowed after registering through a questionnaire.

Here we test ChatGLM3-6B in two ways: first with its standalone web demo, then through text-generation-webui.

First, clone https://github.com/THUDM/ChatGLM3.git. Then create a virtual environment with python -m venv venv, activate it with venv\Scripts\activate (on Windows), and install the dependencies:

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Finally, download the model. If Hugging Face is hard to reach from your network, you can download it from ModelScope instead:

from modelscope import snapshot_download
model_dir = snapshot_download("ZhipuAI/chatglm3-6b", revision="v1.0.0", cache_dir="your model path")

Modify the model-loading code in web_demo.py to point at the local path:

tokenizer = AutoTokenizer.from_pretrained("your model path", trust_remote_code=True)

model = AutoModel.from_pretrained("your model path", trust_remote_code=True).cuda()
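Before launching the demo, a quick check from a Python shell can confirm that the weights load and respond. The following is only a minimal sketch: it assumes the chat method that ChatGLM's remote code attaches to the model, which returns the reply together with the updated history. The helper names load_chatglm and ask are hypothetical, and "your model path" is the same placeholder as above.

```python
def load_chatglm(model_path):
    """Load tokenizer and model from a local directory.

    Needs transformers, torch, and a CUDA GPU; imported lazily so that
    the ask() helper below has no hard dependency on them.
    """
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True).cuda()
    return tokenizer, model.eval()


def ask(model, tokenizer, prompt, history=None):
    """Send one prompt and return (reply, updated_history)."""
    # Assumption: model.chat is provided by ChatGLM's remote code.
    reply, history = model.chat(tokenizer, prompt, history=history or [])
    return reply, history


# Example usage (commented out because it needs the downloaded weights):
# tokenizer, model = load_chatglm("your model path")
# reply, history = ask(model, tokenizer, "Hello")
# print(reply)
```

Passing the returned history back into the next ask call keeps the conversation multi-turn.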

Then you can start python web_demo.py:

Because of my graphics card's limits, generation is somewhat slow, but the output is still much more detailed than Claude2's. For comparison, the following content was generated by Claude2:

Okay, here is a code example of a simple weather forecast neural network based on LSTM and PyTorch:


```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader


#Create Dataset to load weather data
class WeatherDataset(Dataset):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __len__(self):
        return len(self.x)
    
    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


#Define LSTM model
class WeatherForecastLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(WeatherForecastLSTM, self).__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.fc = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        out, _ = self.lstm(x)
        out = out[-1, :, :]
        out = self.fc(out)
        return out


# Prepare training data
train_x = # Temperature data for the past 7 days
train_y = # Day 8 temperature


train_data = WeatherDataset(train_x, train_y)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)


#Create LSTM model
input_size = 1 # Temperature data
seq_length = 7 # 7 days of data
num_layers = 1
hidden_size = 32
output_size = 1 # Predict the temperature on the 8th day


model = WeatherForecastLSTM(input_size, hidden_size, num_layers, output_size)


# Define loss function and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)


# train
num_epochs = 100
for epoch in range(num_epochs):
    for batch_x, batch_y in train_loader:
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
    print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch + 1, num_epochs, loss.item()))
```


This model uses the LSTM network to learn the long-term dependencies of weather time series data and can be used to predict future weather. We can adjust the model structure based on different actual weather data, add more features, and further improve forecast accuracy.
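A note on the Claude2 snippet: it leaves train_x and train_y as placeholders, so it does not run as printed. Here is a minimal sketch of one way to fill them, assuming a plain Python list of daily temperatures. The helper name make_windows and the sample readings are made up for illustration and are not part of either model's output.

```python
def make_windows(temps, window=7):
    """Split a temperature series into (past `window` days, next day) pairs."""
    xs, ys = [], []
    for start in range(len(temps) - window):
        xs.append(temps[start:start + window])  # `window` consecutive days as input
        ys.append(temps[start + window])        # the following day as target
    return xs, ys


# Made-up readings, for illustration only
daily_temps = [21.0, 22.5, 23.1, 22.8, 24.0, 25.2, 24.7, 23.9, 22.4, 21.8]
train_x, train_y = make_windows(daily_temps)
print(len(train_x), train_y)  # prints: 3 [23.9, 22.4, 21.8]
```

To feed these into the LSTM, the lists would be converted to float tensors of shape (samples, 7, 1). Since nn.LSTM expects sequence-first input by default, the model would also need batch_first=True (and out[:, -1, :] in forward) to read batches from a DataLoader correctly.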

Below is the content generated by ChatGLM3:

To use LSTM and PyTorch to implement a weather forecasting neural network, you need to complete the following steps:
1. Prepare data: First, you need to prepare a data set containing dates and corresponding temperature, humidity, wind speed and other meteorological data. Datasets can be CSV files, Excel files, or other formats. Make sure that each date in the data set has corresponding meteorological data such as temperature, humidity, wind speed, etc.
2. Data preprocessing: Sort the data set by date and convert the date into a Tensor in PyTorch. Add the corresponding meteorological fields such as temperature, humidity, and wind speed to the Tensor of each sample.
3. Define the model: Define a neural network model containing LSTM layers. LSTM is a neural network suitable for processing time series data. Add a fully connected layer to the model, pass the output of the LSTM layer to the fully connected layer, and use the softmax function to output the predicted weather conditions (such as cloudy, sunny, rainy, etc.).
4. Train the model: Input the prepared data into the model and train using the cross-entropy loss function and stochastic gradient descent (SGD) optimizer. During the training process, you need to set some hyperparameters such as learning rate, batch size, number of iterations, etc.
5. Validate the model: Use the validation set to evaluate the performance of the model. Calculate the model's accuracy, loss function value and other indicators on the validation set to determine whether the model is overfitting or underfitting.
6. Prediction: Use the trained model to predict new dates. Convert the new date to a Tensor, then feed the Tensor into the model to get the predicted weather conditions.
Here is a simple example code:




import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import Dataset, DataLoader
import pandas as pd
#Data preprocessing
def preprocess_data(data):
    data['date'] = pd.to_datetime(data['date'])
    data['temperature'] = data['temperature'].astype('float32')
    data['humidity'] = data['humidity'].astype('float32')
    data['wind_speed'] = data['wind_speed'].astype('float32')
    encoder = LabelEncoder()
    data['weather'] = encoder.fit_transform(data['weather'])
    data = data.drop(['weather', 'date'], axis=1)
    return data
class WeatherDataset(Dataset):
    def __init__(self, data):
        self.data = data
    def __len__(self):
        return len(self.data)
    def __getitem__(self, index):
        return self.data.iloc[index]
# Define model
class WeatherModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(WeatherModel, self).__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.fc = nn.Linear(hidden_size, output_size)
    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        out = self.fc(lstm_out[:, -1, :])
        return out
#Train model
def train_model(model, device, train_loader, val_loader, optimizer, epochs, lr):
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            total_correct=0
            total_samples = 0
            for data, target in val_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                _, predicted = torch.max(output.data, 1)
                total_correct += (predicted == target).sum().item()
                total_samples += target.size(0)
            accuracy = total_correct / total_samples
            print(f'Epoch {epoch + 1}, Validation Accuracy: {accuracy}')
# predict
def predict_weather(model, device, test_loader):
    model.eval()
    predictions = []
    with torch.no_grad():
        for data, _ in test_loader:
            data = data.to(device)
            output = model(data)
            _, predicted = torch.max(output.data, 1)
            predictions.extend(predicted.cpu().numpy())
    return predictions
# Main function
def main():
    #Read data
    data = pd.read_csv('weather_data.csv')
    train_data = preprocess_data(data)
    train_dataset = WeatherDataset(train_data)
    val_data = preprocess_data(data)
    val_dataset = WeatherDataset(val_data)
    test_data = preprocess_data(data)
    test_dataset = WeatherDataset(test_data)
    # Define model
    input_size = train_data.shape[1]
    hidden_size = 128
    output_size = len(train_data['weather'].unique())
    model = WeatherModel(input_size, hidden_size, output_size)
    # equipment
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # Data loading
    train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=32, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
    #Train model
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    train_model(model, device, train_loader, val_loader, optimizer, 10, 0.001)
    # predict
    predictions = predict_weather(model, device, test_loader)
    print(predictions)
if __name__ == '__main__':
    main()


This example code is just a basic framework; you will need to adjust it to your own data set and needs.
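One adjustment worth making: as generated, main() reads the same CSV for the training, validation, and test sets, so the reported validation accuracy says little about generalization. Here is a minimal sketch of a chronological split, assuming the rows are already sorted by date. The function name chronological_split and the 70/15/15 ratio are illustrative assumptions, not part of ChatGLM3's output.

```python
def chronological_split(rows, train_pct=70, val_pct=15):
    """Split date-sorted rows into train/validation/test without shuffling.

    Percentages are integers so the cut points involve no float rounding;
    whatever remains after train and validation becomes the test set.
    """
    n = len(rows)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


rows = list(range(20))  # stand-in for 20 date-sorted samples
train, val, test = chronological_split(rows)
print(len(train), len(val), len(test))  # prints: 14 3 3
```

Keeping the order intact means the model is always validated and tested on days later than the ones it was trained on, which matches how a forecast would actually be used.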

Next, we call the model from text-generation-webui. For configuration, you can refer to my previous post on how to elegantly use various LLMs. Put the downloaded model into the models folder, then start the server: python server.py --trust-remote-code
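If the model was downloaded to a cache directory elsewhere, copying it into place can be scripted. This is a minimal sketch using only the standard library; the helper name place_model and the folder name chatglm3-6b are assumptions for illustration, and the paths in the usage comment are placeholders.

```python
import shutil
from pathlib import Path


def place_model(model_dir, webui_root, name="chatglm3-6b"):
    """Copy a downloaded model directory into <webui_root>/models/<name>."""
    target = Path(webui_root) / "models" / name
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run safely
    shutil.copytree(model_dir, target, dirs_exist_ok=True)
    return target


# Example usage (placeholder paths):
# place_model("your model path", "path/to/text-generation-webui")
```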

Overall, the results are in line with expectations.

Interested readers can try it out for themselves.