How to use Trend Cloud GPUs to run an AI project online – deploy the latest ChatGLM3-6B model

Learning tutorial 1. Get free computing power. Datawhale exclusive registration: Trend Cloud https://growthdata.virtaicloud.com/t/vs 2. Deploy the latest ChatGLM3-6B model 1. Create a project. After creating an account, enter your own space and click Create Project in the upper right corner. Give the project a name you like and choose local code. For the image, select pytorch2.0.1, python3.9. Select the […]

Why CUDA code on the GPU can run slower rather than faster when using shared memory, and how to fix it

I wrote several operators for adding two images and for image filtering, each optimized with shared memory. #include <stdio.h> #include <cuda_runtime.h> #include "helper_cuda.h" #include "helper_timer.h" #define BLOCKX 32 #define BLOCKY 32 #define BLOCK_SIZE 1024 #define PADDING 2 __global__ void filter5x5(float* in, float* out, int nW, int nH) { // Thread index ---> Global memory […]

PaddleNLP natural language processing knowledge graph: "Out of memory error on GPU 0" (insufficient GPU memory) when using the uie-x-base, uie-m-large, or uie-m-base models

Hi, I'm @cargoyouxing. I'm interested in … I'm currently learning … I'm looking to collaborate on … How to reach me … README directory (continuously updated): various error handling, crawler practice and templates, Baidu Intelligent Cloud face recognition, computer vision deep learning CNN image recognition and classification, PaddlePaddle natural language processing knowledge graph, GitHub, operation and maintenance… […]

2 Task 2: Practice cat and dog recognition using a Trend Cloud GPU

Practice of cat and dog recognition using a Trend Cloud GPU: 1 Create a project 2 Initialize the development environment 3 Debug the code 4 Submit offline tasks 5 Store and download the result set. Use the free GPU provided by Trend Cloud to practice cat and dog recognition. Although the provided examples are based on TensorFlow, you can also […]

Lightweight encapsulated WebGPU rendering system example <21> – 3D rendering of the cellular automaton Game of Life (source code)

Implementation principle: basic PBR lighting and GPU compute. The aim is to present the relevant object logic through data-oriented, loosely structured semantic descriptions. Current sample source code GitHub address: https://github.com/vilyLei/voxwebgpu/blob/feature/rendering/src/voxgpu/sample/GameOfLife3DPBR.ts This example is implemented based on this rendering system. The current example TypeScript source code is as follows: const […]

Lightweight encapsulated WebGPU rendering system example <11> – WebGPU simple PBR effect (source code)

Current sample source code GitHub address: https://github.com/vilyLei/voxwebgpu/blob/main/src/voxgpu/sample/SimplePBRTest.ts Features implemented by this sample rendering system: 1. Isolation of user state and system state. For details, please see: Engine System Design Ideas – Isolation of User State and System State – CSDN Blog 2. Isolation of high-frequency calls from low-frequency calls. 3. User-oriented, easy-to-use wrappers. 4. […]

Configure a GPU training environment (Anaconda) in PyCharm (YOLOv5)

Table of Contents 1. Specific configuration process 2. Create a virtual environment at a specified location (path) 3. Commonly used conda commands 4. Some problems encountered when running the model 4.1 The Python interpreter added via conda cannot find the corresponding python.exe file 4.2 Error "OSError: [WinError 1455] The paging file is too small for this operation […]

Analysis of GPU memory usage of convolutional neural network (CNN) models

1. Reference materials: "A brief discussion on deep learning: how to calculate the memory usage of models and intermediate variables"; "How to make precise use of video memory in PyTorch". 2. Background 0. Preliminary knowledge: for ease of calculation, this article converts units according to the following standards: 1G = 1000MB 1 […]
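The unit convention in the excerpt above can be tried out in a small calculation. The sketch below is not taken from the article: it estimates the memory of a single tensor as element count times bytes per element, and it assumes that the decimal convention (1G = 1000MB) also applies to smaller units, so that 1 MB is 1,000,000 bytes. The function name `tensor_megabytes` and the example shape are hypothetical.

```python
def tensor_megabytes(shape, bytes_per_element=4):
    """Estimate the size of one tensor in MB.

    Assumption (not confirmed by the excerpt): decimal units,
    i.e. 1 MB = 1,000,000 bytes, in line with 1G = 1000MB.
    bytes_per_element=4 corresponds to float32.
    """
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element / 1_000_000


if __name__ == "__main__":
    # Hypothetical batch: 256 RGB images of 100x100 pixels, stored as float32.
    print(tensor_megabytes((256, 3, 100, 100)))  # 30.72
```

The same function applies to model parameters if the parameter count is passed as a one-element shape, for example `tensor_megabytes((n_params,))`.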

LLM – GPU computing power evaluation during training and inference

Table of Contents 1. Introduction 2. FLOPs and TFLOPs ◆ FLOPs [Floating point Operation Per Second] ◆ TFLOPs [Tera Floating point Operation Per Second] 3. GPU consumption during training phase ◆ Factors affecting training ◆ GPT-3 training statistics ◆ Custom training GPU evaluation 4. GPU consumption during inference phase ◆ Factors affecting inference ◆ Custom […]
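The table of contents above lists a custom training GPU evaluation. As a rough sketch of that kind of estimate (not the article's own method, which the excerpt does not show), a widely used rule of thumb puts the training compute of a dense transformer at about 6 floating point operations per parameter per token. The function names, GPU count, peak throughput, and utilization below are illustrative assumptions. The GPT-3 figures of 175 billion parameters and 300 billion training tokens are the publicly reported ones.

```python
SECONDS_PER_DAY = 86_400


def training_flops(n_params, n_tokens):
    """Approximate total training compute: about 6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens


def training_days(n_params, n_tokens, n_gpus, peak_tflops, utilization):
    """Estimate wall-clock days for a training run.

    peak_tflops is the per-GPU peak throughput in TFLOPs;
    utilization is the fraction of that peak actually achieved (0..1).
    Both are inputs the reader must supply; nothing here is measured.
    """
    effective_flops_per_second = n_gpus * peak_tflops * 1e12 * utilization
    return training_flops(n_params, n_tokens) / effective_flops_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # GPT-3 scale: 175e9 parameters, 300e9 tokens.
    print(f"{training_flops(175e9, 300e9):.3e} FLOPs")
    # Assumed cluster: 1024 GPUs, 312 TFLOPs peak each, 40% utilization.
    print(f"{training_days(175e9, 300e9, 1024, 312.0, 0.4):.1f} days")
```

The estimate is only as good as the utilization figure, which in practice depends on the model, the parallelism strategy, and the interconnect.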

Caffe-GPU + CUDA 8.0 + cuDNN v5 + CMake + Python 2.7 + VS2013 + Win11 environment setup

I ran into many pitfalls when setting up the Windows environment. Here I record the process and the solutions I tried. If possible, it is best to install Ubuntu on the machine and run it there. 1. Environment (you do not have to install the same versions I did; this just provides […]