ChatGLM3 is a new generation of pre-trained dialogue models jointly released by Zhipu AI and Tsinghua University's KEG Laboratory. ChatGLM3-6B is the open-source model in the ChatGLM3 series. While retaining many strengths of the previous two generations, such as fluent dialogue and a low deployment threshold, ChatGLM3-6B introduces the following features:
1. A more powerful base model: ChatGLM3-6B-Base, the base model of ChatGLM3-6B, uses more diverse training data, more training steps, and a more reasonable training strategy. Evaluations on datasets covering semantics, mathematics, reasoning, code, and knowledge show that ChatGLM3-6B-Base has the strongest performance among base models under 10B parameters.
2. More complete feature support: ChatGLM3-6B adopts a newly designed prompt format. In addition to normal multi-turn conversations, it natively supports complex scenarios such as tool calling (Function Call), code execution (Code Interpreter), and Agent tasks.
3. A more comprehensive open-source lineup: In addition to the dialogue model ChatGLM3-6B, the base model ChatGLM3-6B-Base and the long-text dialogue model ChatGLM3-6B-32K are also open-sourced. All of these weights are fully open to academic research, and free commercial use is also allowed after registering through a questionnaire.
Here we test ChatGLM3-6B in two ways: first with its standalone web demo, then through text-generation-webui.
First, clone https://github.com/THUDM/ChatGLM3.git. Then create a virtual environment with python -m venv venv, activate it with venv\Scripts\activate (on Windows), and install the dependencies:
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
Finally, download the model. If Hugging Face is hard to reach, you can download it from ModelScope instead:
from modelscope import snapshot_download
model_dir = snapshot_download("ZhipuAI/chatglm3-6b", revision="v1.0.0", cache_dir="your model path")
Modify the code of web_demo.py as follows:
tokenizer = AutoTokenizer.from_pretrained("your model path", trust_remote_code=True)
model = AutoModel.from_pretrained("your model path", trust_remote_code=True).cuda()
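Before launching the demo, it can help to confirm that the folder really holds a model and that it answers a single prompt. The following is only a minimal sketch: the helper names `missing_model_files` and `chat_once` are hypothetical, and the list of expected files is an assumption that should be checked against the folder you actually downloaded. The `chat` call comes from the model's remote code and returns the reply together with the updated history.

```python
from pathlib import Path

# Assumed list of files a Hugging Face-style checkpoint folder contains;
# adjust it to match the folder you actually downloaded.
EXPECTED_FILES = ("config.json", "tokenizer_config.json")


def missing_model_files(model_path, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_path."""
    root = Path(model_path)
    return [name for name in expected if not (root / name).is_file()]


def chat_once(model_path, prompt):
    """Load the model from a local folder and return one reply."""
    # Imported lazily so the path check above works without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True).cuda()
    model = model.eval()
    # chat() is defined by the model's remote code: returns (reply, history).
    response, _history = model.chat(tokenizer, prompt, history=[])
    return response


if __name__ == "__main__":
    path = "your model path"
    missing = missing_model_files(path)
    if missing:
        print("Model folder looks incomplete, missing:", missing)
    else:
        print(chat_once(path, "Hello"))
```

If the check reports missing files, the download most likely did not finish or the path points at the wrong folder.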
Then you can start python web_demo.py:
Because of my graphics card's limitations, generation is somewhat slow, but the generated content is still much more detailed than Claude2's. The following content was generated by Claude2:
Okay, here is a code example of a simple weather forecast neural network based on LSTM and PyTorch:

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Create Dataset to load weather data
class WeatherDataset(Dataset):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

# Define LSTM model
class WeatherForecastLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(WeatherForecastLSTM, self).__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm(x)
        out = out[-1, :, :]
        out = self.fc(out)
        return out

# Prepare training data
train_x = # Temperature data for the past 7 days
train_y = # Day 8 temperature
train_data = WeatherDataset(train_x, train_y)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)

# Create LSTM model
input_size = 1  # Temperature data
seq_length = 7  # 7 days of data
num_layers = 1
hidden_size = 32
output_size = 1  # Predict the temperature on the 8th day
model = WeatherForecastLSTM(input_size, hidden_size, num_layers, output_size)

# Define loss function and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# train
num_epochs = 100
for epoch in range(num_epochs):
    for batch_x, batch_y in train_loader:
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch + 1, num_epochs, loss.item()))
```

This model uses the LSTM network to learn the long-term dependencies of weather time series data and can be used to predict future weather. We can adjust the model structure based on different actual weather data, add more features, and further improve forecast accuracy.
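Claude2's answer leaves train_x and train_y as placeholders. One common way to fill them is a sliding window over a daily temperature series, where each sample is seven consecutive days and the target is the day after. The sketch below uses only the standard library; the helper name `make_windows` and the temperature values are made up for illustration and are not part of either model's output.

```python
def make_windows(series, window=7):
    """Split a 1-D series into (past `window` values, next value) pairs."""
    if window < 1:
        raise ValueError("window must be at least 1")
    xs, ys = [], []
    # Sample i uses days i .. i+window-1 as input and day i+window as target.
    for i in range(len(series) - window):
        xs.append(list(series[i:i + window]))
        ys.append(series[i + window])
    return xs, ys


# Made-up daily temperatures, used only to show the resulting shapes.
temps = [20.1, 21.3, 19.8, 22.0, 23.4, 22.7, 21.9, 20.5, 19.9, 21.2]
train_x, train_y = make_windows(temps, window=7)

print(len(train_x), len(train_x[0]), train_y)  # 3 7 [20.5, 19.9, 21.2]
```

To feed these lists to the LSTM they would still need to be converted to float tensors with a trailing feature dimension of size 1. Note also that nn.LSTM expects input shaped (seq_len, batch, input_size) unless it is created with batch_first=True, while a DataLoader puts the batch dimension first, so one of the two has to be adjusted.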
Below is the content generated by ChatGLM3:
To use LSTM and PyTorch to implement a weather forecasting neural network, you need to complete the following steps:
1. Prepare data: First, you need to prepare a data set containing dates and the corresponding temperature, humidity, wind speed, and other meteorological data. Datasets can be CSV files, Excel files, or other formats. Make sure that each date in the data set has corresponding meteorological data such as temperature, humidity, and wind speed.
2. Data preprocessing: Sort the data set by date and convert the date into a Tensor in PyTorch. Add the corresponding fields for meteorological data such as temperature, humidity, and wind speed to the Tensor of each sample.
3. Define the model: Define a neural network model containing LSTM layers. LSTM is a neural network suitable for processing time series data. Add a fully connected layer to the model, pass the output of the LSTM layer to the fully connected layer, and use the softmax function to output the predicted weather conditions (such as cloudy, sunny, or rainy).
4. Train the model: Input the prepared data into the model and train using the cross-entropy loss function and the stochastic gradient descent (SGD) optimizer. During training, you need to set hyperparameters such as the learning rate, batch size, and number of iterations.
5. Validate the model: Use the validation set to evaluate the performance of the model. Calculate the model's accuracy, loss value, and other metrics on the validation set to determine whether the model is overfitting or underfitting.
6. Prediction: Use the trained model to predict new dates. Convert the new date to a Tensor, then feed the Tensor into the model to get the predicted weather conditions.
Here is a simple example code:

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import Dataset, DataLoader
import pandas as pd

# Data preprocessing
def preprocess_data(data):
    data['date'] = pd.to_datetime(data['date'])
    data['temperature'] = data['temperature'].astype('float32')
    data['humidity'] = data['humidity'].astype('float32')
    data['wind_speed'] = data['wind_speed'].astype('float32')
    encoder = LabelEncoder()
    data['weather'] = encoder.fit_transform(data['weather'])
    data = data.drop(['weather', 'date'], axis=1)
    return data

class WeatherDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data.iloc[index]

# Define model
class WeatherModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(WeatherModel, self).__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        out = self.fc(lstm_out[:, -1, :])
        return out

# Train model
def train_model(model, device, train_loader, val_loader, optimizer, epochs, lr):
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            total_correct = 0
            total_samples = 0
            for data, target in val_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                _, predicted = torch.max(output.data, 1)
                total_correct += (predicted == target).sum().item()
                total_samples += target.size(0)
            accuracy = total_correct / total_samples
            print(f'Epoch {epoch + 1}, Validation Accuracy: {accuracy}')

# Predict
def predict_weather(model, device, test_loader):
    model.eval()
    predictions = []
    with torch.no_grad():
        for data, _ in test_loader:
            data = data.to(device)
            output = model(data)
            _, predicted = torch.max(output.data, 1)
            predictions.extend(predicted.cpu().numpy())
    return predictions

# Main function
def main():
    # Read data
    data = pd.read_csv('weather_data.csv')
    train_data = preprocess_data(data)
    train_dataset = WeatherDataset(train_data)
    val_data = preprocess_data(data)
    val_dataset = WeatherDataset(val_data)
    test_data = preprocess_data(data)
    test_dataset = WeatherDataset(test_data)

    # Define model
    input_size = train_data.shape[1]
    hidden_size = 128
    output_size = len(train_data['weather'].unique())
    model = WeatherModel(input_size, hidden_size, output_size)

    # Device
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Data loading
    train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=32, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

    # Train model
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    train_model(model, device, train_loader, val_loader, optimizer, 10, 0.001)

    # Predict
    predictions = predict_weather(model, device, test_loader)
    print(predictions)

if __name__ == '__main__':
    main()

This example code is just a basic framework; you will need to adjust it according to your data set and needs.
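One adjustment stands out when reading the ChatGLM3 output: preprocess_data drops the weather column right after encoding it, yet main() later reads train_data['weather'] to count the classes, and the dataset returns whole rows instead of (features, label) pairs. A minimal standard-library sketch of keeping the encoded labels next to the features is shown below. The helper names and the sample rows are hypothetical, and `encode_labels` only imitates the sorted-class ordering that scikit-learn's LabelEncoder produces.

```python
def encode_labels(labels):
    """Map string labels to integer codes, with classes in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


def split_features_and_labels(rows, feature_names, label_name="weather"):
    """Return (features, codes, classes) without discarding the labels."""
    features = [[float(row[name]) for name in feature_names] for row in rows]
    codes, classes = encode_labels([row[label_name] for row in rows])
    return features, codes, classes


# Made-up rows standing in for the contents of weather_data.csv.
rows = [
    {"temperature": 21.5, "humidity": 0.6, "wind_speed": 3.2, "weather": "sunny"},
    {"temperature": 17.0, "humidity": 0.9, "wind_speed": 5.1, "weather": "rainy"},
    {"temperature": 23.1, "humidity": 0.5, "wind_speed": 2.4, "weather": "sunny"},
    {"temperature": 19.4, "humidity": 0.7, "wind_speed": 4.0, "weather": "cloudy"},
]
features, targets, classes = split_features_and_labels(
    rows, ["temperature", "humidity", "wind_speed"]
)

print(classes)       # ['cloudy', 'rainy', 'sunny']
print(targets)       # [2, 1, 2, 0]
print(len(classes))  # 3, usable as the model's output_size
```

With the features and targets kept separate, a dataset's __getitem__ can return one (features, target) pair per index, which is the shape the training loop in the generated code expects.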
Next, we call this model from text-generation-webui. For configuration, you can refer to my previous post on how to elegantly use various LLMs. Put the downloaded model into the models folder, then start the server:
python server.py --trust-remote-code
Overall, the results are in line with expectations.
Interested readers can try it for themselves.