Stable Diffusion stable-diffusion-webui ControlNet Lora

Stable Diffusion

Stable Diffusion is used to generate images from text, ControlNet is used to control composition, and LoRA is used to control style.

stable-diffusion-webui

Mirror of the official repository (faster access from mainland China):

mirrors/AUTOMATIC1111/stable-diffusion-webui · GitCode

Installation reference:

Stable Diffusion Installation and Common Errors (+ Lora Use) Latest Installation Tutorial in 2023_cycyc123’s Blog-CSDN Blog

ComfyUI

Nanny-level tutorial: Build a complete workflow of Stable Diffusion XL from 0 to 1 for AI painting_WeThinkIn’s Blog-CSDN Blog

StableDiffusion model resource exploration and consumption guide – Zhihu

Large model

Here, "large model" refers specifically to a standard latent-diffusion model that contains a complete text encoder, U-Net, and VAE.

Training a large model is very difficult and demands an enormous amount of GPU compute, so most people never train one themselves.

CKPT

A model, also called a checkpoint, is the set of weights saved after training on a collection of images.

CKPT is short for checkpoint, the common format for complete models. These files are relatively large: a single photorealistic model is typically about 7 GB, and an anime-style model is typically 2 to 5 GB.

Early checkpoints used the .ckpt suffix; newer ones use .safetensors.
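One practical consequence of the .safetensors format is that a file can be inspected without loading its weights: it starts with an 8-byte little-endian length, followed by a JSON header that lists every tensor. The sketch below reads that header using only the standard library. The function names are illustrative and are not part of any WebUI API.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict.

    Layout: 8 bytes (little-endian unsigned 64-bit header length),
    then that many bytes of UTF-8 JSON, then the raw tensor data.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(path):
    """Return (number of tensors, total element count) from the header only."""
    header = read_safetensors_header(path)
    total = 0
    count = 0
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        n = 1
        for dim in info["shape"]:
            n *= dim
        total += n
        count += 1
    return count, total
```

Calling summarize on a downloaded file gives a rough parameter count, which helps tell a full checkpoint (billions of elements) from a much smaller add-on model.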

VAE

Full name: VAE stands for variational autoencoder. In Stable Diffusion, its decoder is responsible for converting data in latent space into ordinary images.

Suffix format: VAE files generally use the .pt suffix.
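To make "latent space" concrete, the sketch below computes the size of the latent that corresponds to an image. It assumes a spatial downscale factor of 8 and 4 latent channels, which matches the standard Stable Diffusion VAEs; both values are parameters because other models may differ.

```python
def latent_shape(height, width, scale=8, channels=4):
    """Shape (channels, height, width) of the latent for a given image size.

    scale=8 and channels=4 are assumptions matching the standard
    Stable Diffusion VAEs; other models may use different values.
    """
    if height % scale or width % scale:
        raise ValueError("image sides must be multiples of the VAE scale factor")
    return (channels, height // scale, width // scale)


def compression_ratio(height, width, scale=8, channels=4):
    """How many RGB pixel values are represented by one latent value."""
    c, h, w = latent_shape(height, width, scale, channels)
    return (3 * height * width) / (c * h * w)
```

Under these assumptions, a 512x512 image corresponds to a 4x64x64 latent, so the U-Net works on far fewer values than the final picture contains.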

Lora

A LoRA is a much smaller model that fine-tunes a large model. Whereas only one large model can be selected per generation, one or more LoRA models can be added on top of the selected large model. A LoRA file is typically tens to hundreds of megabytes.
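The small file size follows from the low-rank idea behind LoRA: instead of storing a new copy of a weight matrix W, it stores two thin matrices B and A and adds their product to W. The following is a minimal sketch of that idea with plain Python lists; the layer size and rank used in the note below are illustrative values, not taken from any particular model.

```python
def full_params(d_out, d_in):
    """Number of values in a full d_out x d_in weight matrix."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Number of values in the LoRA pair: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def apply_lora(W, B, A, scale=1.0):
    """Return W + scale * (B @ A), using nested lists as matrices."""
    rows, cols, rank = len(W), len(W[0]), len(A)
    return [
        [W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
         for j in range(cols)]
        for i in range(rows)
    ]
```

For a hypothetical 768x768 layer and rank 8, the full matrix has 589,824 values while the LoRA pair has 12,288, which is 48 times fewer. Because the update is simply added to the base weights, several LoRA models can be stacked on one large model.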

LoRA files also use the .safetensors suffix, so beginners easily confuse them with checkpoints. The installation steps for both are therefore covered next.

Installation of CKPT and lora

Most users deal with two main kinds of model file: checkpoints (CKPT) and LoRA.

The best site for downloading models is the well-known "C station", http://civitai.com (access from mainland China may require a VPN).

Checkpoints are installed in models\Stable-diffusion. Copy the file there and it is ready to use: refreshing the model list is enough, and the service does not need to be restarted.

./stable-diffusion-webui/models/Stable-diffusion

The LoRA installation path is easy to get wrong. A Stable Diffusion installation has models\Lora by default, but many tutorials say that this is not the LoRA directory to use:

./stable-diffusion-webui/models/Lora

Instead, they point to another one: extensions\sd-webui-additional-networks\models\lora

In fact, a freshly unpacked Stable Diffusion installation does not contain this second path, and it does not need to be created by hand. It is set up through the Stable Diffusion web interface.
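The two default locations described above can be captured in a small copy helper. This is only a sketch: install_model and MODEL_DIRS are hypothetical names, and the caller has to say which kind of file it is, because checkpoints and LoRA files share the same .safetensors suffix.

```python
import shutil
from pathlib import Path

# Sub-directories of the WebUI root, as given in the text above.
MODEL_DIRS = {
    "checkpoint": Path("models") / "Stable-diffusion",
    "lora": Path("models") / "Lora",
}


def install_model(src, webui_root, kind):
    """Copy a model file into the WebUI folder for its kind; return the new path."""
    if kind not in MODEL_DIRS:
        raise ValueError(f"unknown model kind: {kind!r}")
    src = Path(src)
    target_dir = Path(webui_root) / MODEL_DIRS[kind]
    target_dir.mkdir(parents=True, exist_ok=True)  # harmless if it already exists
    return Path(shutil.copy2(src, target_dir / src.name))
```

For example, install_model("style.safetensors", "./stable-diffusion-webui", "lora") would place the file under models/Lora. After copying, refresh the model list in the web interface.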

AI study notes | Making digital life more real: model (checkpoint) and fine-tuning model (lora) – Zhihu

Stable Diffusion XL

Reference:

A complete and in-depth analysis of the core basic knowledge of Stable Diffusion XL (SDXL) – Zhihu

The table above compares Stable Diffusion XL with the earlier Stable Diffusion series. It shows that the U-Net of Stable Diffusion V1.4/1.5 has only 860M parameters, and even that of Stable Diffusion V2.0/2.1 has only 865M. In Stable Diffusion XL, the U-Net (Base part) grows to 2.6B parameters, about three times as many.
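The "about three times" figure can be checked with a few lines of arithmetic. The parameter counts below are the ones quoted in the text; the size estimate additionally assumes 16-bit storage (2 bytes per parameter), which is an assumption and not something stated in the source.

```python
# U-Net parameter counts as quoted in the text.
UNET_PARAMS = {
    "SD 1.4/1.5": 860e6,
    "SD 2.0/2.1": 865e6,
    "SDXL Base": 2.6e9,
}


def ratio(model, baseline):
    """How many times larger one U-Net is than another."""
    return UNET_PARAMS[model] / UNET_PARAMS[baseline]


def weight_size_gb(params, bytes_per_param=2):
    """Approximate size of the weights alone, assuming fp16 (2 bytes each)."""
    return params * bytes_per_param / 1e9
```

With these numbers, ratio("SDXL Base", "SD 1.4/1.5") is roughly 3.02, and under the fp16 assumption the SDXL Base U-Net weights alone occupy about 5.2 GB.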

There are currently four frameworks that can load the Stable Diffusion XL model and generate images:

  1. ComfyUI framework

  2. SD.Next framework

  3. Stable Diffusion WebUI framework

  4. diffusers framework

ControlNet

ControlNet is a neural network structure that controls a diffusion model by adding extra conditions. It provides a way to enhance Stable Diffusion with conditional inputs such as scribbles, edge maps, segmentation maps, and pose keypoints during text-to-image generation. The generated image follows the input image more closely, which is a great improvement over traditional image-to-image generation methods.

ControlNet models can be trained on small datasets and then combined with any pre-trained Stable Diffusion model to extend it, achieving an effect similar to fine-tuning.

The initial version of ControlNet comes with the following pretrained weights:
  • Canny edge – A monochrome image with white edges on a black background.
  • Depth – Grayscale image, with black representing deep areas and white representing shallow areas.
  • Normal map – Normal map image.
  • Semantic segmentation map – Segmentation image of ADE20K.
  • HED edge – A monochrome image with white soft edges on a black background.
  • Scribbles – Hand-drawn monochrome scribble image with white outlines on a black background.
  • OpenPose (pose keypoints) – OpenPose skeleton image.
  • M-LSD – A monochrome image consisting only of white straight lines on a black background.
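Most of the conditioning images in the list above share one convention: white lines on a black background. The sketch below produces such an image from a grayscale picture with a plain gradient threshold. It is a deliberately simplified stand-in for illustration, not the Canny algorithm itself, and the threshold value is an arbitrary assumption.

```python
def edge_map(gray, threshold=50):
    """Return a 0/255 image that is white where neighbouring pixels differ strongly.

    gray is a list of rows of 0-255 intensities. This is a simple
    gradient threshold, not the full Canny algorithm.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences to the right and lower neighbours (clamped at borders).
            dx = gray[y][min(x + 1, w - 1)] - gray[y][x]
            dy = gray[min(y + 1, h - 1)][x] - gray[y][x]
            if abs(dx) + abs(dy) >= threshold:
                out[y][x] = 255
    return out
```

An image whose left half is dark and whose right half is bright yields a single white vertical line at the boundary, which is the kind of input the Canny edge weights expect.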

Reference:

Use ControlNet to control Stable Diffusion-Tencent Cloud Developer Community-Tencent Cloud

ControlNet precision control AI painting tutorial – Nuggets

Detailed introduction to ControlNet – Zhihu

Resources

https://lexica.art/

lexica.art hosts text descriptions and images for millions of Stable Diffusion examples, which makes it a rich source of creative and prompt inspiration.

https://civitai.com/

Civitai is a community of AI art enthusiasts. The site hosts many custom models, including ones trained specifically for 3D, photorealistic, character, and other painting styles. Using a model trained for a particular subject therefore greatly improves the expressiveness of images of that subject.

Hugging Face – The AI community building the future.

Hugging Face is a website focused on building, training, and deploying the latest models. Many of these models are trained by individual developers and published on their own model pages.

Hugging Face is the platform of choice for creators building AI models for Stable Diffusion. As of now, there are hundreds of models related to Stable Diffusion on the platform.