How to run a Stable Diffusion model locally


Following DALL-E 2 and Imagen, the new deep learning model Stable Diffusion marks a huge leap forward in text-to-image generation. Released earlier this month, Stable Diffusion promises to democratize text-conditioned image generation by running efficiently enough on consumer-grade GPUs. Just this Monday, the Stable Diffusion checkpoint was released publicly for the first time, which means you can now generate an image like the one below with just a few words and a few minutes of your time.

This article will show you how to install and run Stable Diffusion on both GPU and CPU so you can start generating your own images. Let’s get started!

Want to know how Stable Diffusion works?

Check out our article on how physics is advancing generative AI for a visual explanation of diffusion models like Stable Diffusion.

Using Stable Diffusion in Colab

Before we look at how to install and run Stable Diffusion locally, you can check out the Colab notebooks below to use Stable Diffusion non-locally. Please note that you will need Colab Pro to generate new images on a GPU, as the GPUs in Colab’s free tier do not have enough VRAM for sampling.

Stable Diffusion in Colab (GPU)

If you don’t have Colab Pro, you can also run Stable Diffusion on Colab’s CPU, but be aware that image generation will take a relatively long time (8-12 minutes):

Stable Diffusion in Colab (CPU)

You can also check out our Stable Diffusion tutorial on YouTube for a walkthrough using a GPU notebook.

How to install Stable Diffusion (GPU)

You will need a UNIX-based operating system to follow this tutorial, so if you have a Windows computer, consider using a virtual machine or WSL2.

Step 1: Install Python

First, check whether Python is installed on your system by typing python --version in the terminal. If a Python version is returned, continue to the next step. Otherwise, install Python 3.8:

sudo apt-get update
yes | sudo apt-get install python3.8
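
Note that on Ubuntu this installs the python3.8 binary; depending on your distribution, the bare python command may not point at it. If python --version fails after installation, try:

python3 --version
python3.8 --version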

Step 2: Install Miniconda

Next, we need to make sure the package/environment manager conda is installed. Type conda --version in the terminal. If a conda version is returned, continue to the next step.

Otherwise, go to the conda website, then download and run the Miniconda installer for your Python version and operating system. For Python 3.8, you can download and run the installer with the following commands:

wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh
bash Miniconda3-py38_4.12.0-Linux-x86_64.sh

Hold Enter to scroll through the license, then type “yes” when prompted to accept it. Next, press Enter to confirm the installation location, then type “yes” when asked whether the installer should initialize Miniconda. Finally, close the terminal and open a new one before installing Stable Diffusion.
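
Once the new terminal is open, you can confirm that conda is on your PATH:

conda --version

This should print a version string such as conda 4.12.0 (the exact number depends on the installer you used).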

Step 3: Clone the Stable Diffusion repository

Now we need to clone the Stable Diffusion repository. In the terminal, execute the following command:

git clone https://github.com/CompVis/stable-diffusion.git
cd stable-diffusion/

If you don’t have git, install it first with sudo apt install git. Be sure to read and accept the Stable Diffusion license before cloning the repository.

Step 4: Create Conda environment

Next, we need to create a conda environment containing all the packages needed to run Stable Diffusion. Execute the following commands to create and activate an environment named ldm:

conda env create -f environment.yaml
conda activate ldm
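
As an optional sanity check, you can confirm that the environment is active and that PyTorch (which the environment installs) can see your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

If this prints False, Stable Diffusion will not be able to use your GPU; check your CUDA drivers before continuing.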

Step 5: Download Stable Diffusion Weights

Now that we have a proper environment for Stable Diffusion, we need to download the weights required to run it. If you have not already read and accepted the Stable Diffusion license, be sure to do so now. Several Stable Diffusion checkpoint versions have been released; higher version numbers have been trained on more data and generally perform better than lower ones. We will be using checkpoint v1.4. Download the weights with the following command:

curl https://f004.backblazeb2.com/file/aai-blog-files/sd-v1-4.ckpt > sd-v1-4.ckpt
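
The v1.4 checkpoint is roughly 4 GB, so a quick size check helps catch an incomplete download:

ls -lh sd-v1-4.ckpt

If the file is much smaller than that, delete it and re-run the curl command above.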

That’s all the setup we need to get started with Stable Diffusion! Read on to learn how to use the model to generate images.

How to generate images with Stable Diffusion (GPU)

To generate images with Stable Diffusion, open a terminal and navigate to the stable-diffusion directory. Make sure you are in the correct environment by executing conda activate ldm.

To generate an image, run the following command:

python scripts/txt2img.py --prompt "YOUR-PROMPT-HERE" --plms --ckpt sd-v1-4.ckpt --skip_grid --n_samples 1


Replace YOUR-PROMPT-HERE with the prompt for which you want to generate an image (keeping the quotes). Running this command with the prompt “Realistic vaporwave image of a lizard riding a snowboard through space” outputs the following image:

The image above was generated in about a minute on an Ubuntu 18.04 VM with an NVIDIA Tesla K80 GPU in GCP.

Script options

You can customize this script with several command line parameters to tailor the results to your needs. Let’s take a look at a few that might come in handy:

  1. --prompt followed by a sentence in quotes specifies the prompt for which to generate the image. The default is “a painting of a virus monster playing guitar”.
  2. --from-file followed by a file path specifies a file of prompts, one per line, for which to generate images.
  3. --ckpt followed by a path specifies the model checkpoint to use. The default is models/ldm/stable-diffusion-v1/model.ckpt.
  4. --outdir followed by a path specifies the output directory where generated images are saved. The default is outputs/txt2img-samples.
  5. --skip_grid skips creating an image that merges all of the samples into a grid.
  6. --ddim_steps followed by an integer specifies the number of sampling steps in the diffusion process. Increasing this number increases computation time but may improve results. The default is 50.
  7. --n_samples followed by an integer specifies the number of samples to generate for each given prompt (the batch size). The default is 3.
  8. --n_iter followed by an integer specifies the number of times to run the sampling loop. It is effectively the same as --n_samples, but use it instead if you run into OOM errors. See the source code for details. The default is 2.
  9. --H followed by an integer specifies the height of the generated images in pixels. The default is 512.
  10. --W followed by an integer specifies the width of the generated images in pixels. The default is 512.
  11. --scale followed by a float specifies the guidance scale to use. The default is 7.5.
  12. --seed followed by an integer sets the random seed (for reproducible results). The default is 42.

You can see the complete list of possible parameters and their default values in the txt2img.py file. Now, let’s look at a more complex generation command using these optional parameters.

In the stable-diffusion directory, create a file called prompts.txt containing several prompts, one per line.
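
For example, a prompts.txt might look like this (the prompts themselves are just illustrative placeholders; substitute your own):

# Illustrative prompts only -- replace with your own, one per line
cat > prompts.txt << 'EOF'
A photograph of a corgi astronaut floating in space
An impressionist painting of a lighthouse at sunset
EOF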

Now, from the stable-diffusion directory, run:

python scripts/txt2img.py \
--from-file prompts.txt \
--ckpt sd-v1-4.ckpt \
--outdir generated-images \
--skip_grid \
--ddim_steps 100 \
--n_iter 3 \
--H 256 \
--W 512 \
--n_samples 3 \
--scale 8.0 \
--seed 119

Two of the resulting images for each prompt can be seen below. The above command is intended as an example of using more of the command line parameters, not as an example of optimal parameters. In general, we have found that larger images are of higher quality and align better with their prompts, and that lower guidance scales may produce better results. Continue to the next section to learn more about improving Stable Diffusion results.

How to install Stable Diffusion (CPU)

Step 1: Install Python

First, check whether Python is installed on your system by typing python --version in the terminal. If a Python version is returned, continue to the next step. Otherwise, install Python 3.8:

sudo apt-get update
yes | sudo apt-get install python3.8

Step 2: Download the repository

Now we need to clone the Stable Diffusion repository. We will use a fork adapted for CPU inference. In the terminal, execute the following commands:

git clone https://github.com/bes-dev/stable_diffusion.openvino.git
cd stable_diffusion.openvino

If you don’t have git, install it first with sudo apt install git. Be sure to read and accept the Stable Diffusion license before cloning the repository.

Step 3: Install requirements

Install all the necessary requirements:

pip install -r requirements.txt

Note that SciPy version 1.9.0 is a listed requirement, but it is not compatible with older Python versions. You may need to change the SciPy version by editing requirements.txt, for example to scipy==1.7.3, before running the above command.
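
If you prefer to make that edit from the command line, a one-liner like the following works, assuming requirements.txt pins exactly scipy==1.9.0:

# Replace the scipy pin in requirements.txt (adjust the versions if your copy of the file differs)
sed -i 's/scipy==1.9.0/scipy==1.7.3/' requirements.txt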

Step 4: Download Stable Diffusion Weights

Now that we have a proper environment for Stable Diffusion, we need to download the weights required to run it. If you have not already read and accepted the Stable Diffusion license, be sure to do so now. Several Stable Diffusion checkpoint versions have been released; higher version numbers have been trained on more data and generally perform better than lower ones. We will be using checkpoint v1.4. Download the weights with the following command:

curl https://f004.backblazeb2.com/file/aai-blog-files/sd-v1-4.ckpt > sd-v1-4.ckpt

That’s all the setup we need to get started with Stable Diffusion! Read on to learn how to use the model to generate images.

How to generate images with Stable Diffusion (CPU)

Now that everything is installed, we are ready to generate images with Stable Diffusion. To generate an image, simply run the following command, changing the prompt to whatever you want:

python demo.py --prompt "bright beautiful solarpunk landscape, photorealism"

Inference takes approximately 8-12 minutes, so feel free to grab a cup of coffee while Stable Diffusion runs. Below we can see the output of running the above command:
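
Since CPU inference is slow, it can be useful to measure exactly how long a run takes on your machine by prefixing the command with the standard time utility:

time python demo.py --prompt "bright beautiful solarpunk landscape, photorealism"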

Tips and Tricks

As you get started with Stable Diffusion, keep these tips and tricks in mind as you explore.

Prompt Engineering

The results of text-to-image models can be sensitive to the wording used to describe the desired scene. Prompt engineering is the practice of tailoring prompts to obtain desired results. For example, if the resulting image is of low quality, try prefixing the prompt with “an image of”. You can also specify different styles and media to achieve different effects. Check out the lists below for ideas:

Image type:

Try preceding the prompt with one of the following for a different effect:

"A photograph of"
"A headshot of"
"A painting of"
"A vision of"
"A depiction of"
"A cartoon of"
"A drawing of"
"A figure of"
"An illustration of"
"A sketch of"
"A portrayal of"

Style

You can specify different styles to achieve different results. Try adding one or more of the following adjectives to the prompt and see the effect.

"Modernist(ic)"
"Abstract"
"Impressionist(ic)"
"Expressionist(ic)"
"Surrealist(ic)"

Aesthetics

You can also try specifying a different aesthetic. Try adding one or more of the following adjectives to the prompt and see the effect.

"Vaporwave"
"Synthwave"
"Cyberpunk"
"Solarpunk"
"Steampunk"
"Cottagecore"
"Angelcore"
"Aliencore"

Artist

You can even try invoking different artists to achieve different visual effects. Try appending one of the following options to the prompt; a combined example follows these lists:

Painter:

 "in the style of Vincent van Gogh"
"in the style of Pablo Picasso"
"in the style of Andrew Warhol"
"in the style of Frida Kahlo"
"in the style of Jackson Pollock"
"in the style of Salvador Dali"

Sculptor:

 "in the style of Michelangelo"
"in the style of Donatello"
"in the style of Auguste Rodin"
"in the style of Richard Serra"
"in the style of Henry Moore"

Architect:

 "in the style of Frank Lloyd Wright"
"in the style of Mies van der Rohe"
"in the style of Eero Saarinen"
"in the style of Antoni Gaudi"
"in the style of Frank Gehry"

Adjust sampling parameters

When adjusting sampling parameters, you can use the following empirical observations to guide your exploration.

Image size

Generally speaking, as a rule of thumb, larger images perform much better than smaller ones in terms of both image quality and prompt alignment. See the following comparison of 256×256 and 512×512 images generated for the prompt “Guy Fieri gives a tour of a haunted house”:

Comparison of 256×256 and 512×512 samples for the prompt “Guy Fieri gives a tour of a haunted house”

Number of diffusion steps

The number of steps in the diffusion process does not seem to have much impact on results beyond a threshold of about 50 time steps. The images below were generated with the same random seed and the prompt “A red sports car”. More time steps initially improve the quality of the generated images, but beyond 50 time steps the improvement amounts only to slight changes in the surroundings of the object of interest. In fact, from time step 25 onward the details of the car are almost identical, and at larger time steps only the environment evolves to better suit the car.

Image aspect ratio

Image quality and prompt similarity as a function of aspect ratio appear to depend on the input prompt. The images below have the same area but different aspect ratios, all generated with the prompt “Steel and glass modern architecture”. The results are relatively even, although the vertical image looks best, followed by the square and then the horizontal one. This should come as no surprise given that modern buildings of this type are tall and skinny, so performance as a function of aspect ratio appears to be subject-dependent.

Unfortunately, Stable Diffusion is limited in the aspect ratios it can generate, which precludes finer-grained experiments; regardless, square images should be sufficient for most purposes.

Checkpoint symbolic link

To avoid having to supply --ckpt sd-v1-4.ckpt every time you generate an image, you can create a symbolic link from the checkpoint to the default --ckpt location. In the terminal, navigate to the stable-diffusion directory and execute the following commands:

mkdir -p models/ldm/stable-diffusion-v1/
# Use an absolute path so the link resolves from the target directory
ln -s "$(pwd)/sd-v1-4.ckpt" models/ldm/stable-diffusion-v1/model.ckpt

Alternatively, simply move the checkpoint to the default --ckpt location:

mv sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
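
Either way, you can verify that the checkpoint is reachable at the default path before generating images:

ls -l models/ldm/stable-diffusion-v1/model.ckpt

For the symbolic link variant, readlink -f on the same path should print the absolute location of sd-v1-4.ckpt.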

Final words

That’s all you need to start generating images with the new Stable Diffusion model. If you want to learn more about how Stable Diffusion works, feel free to check out our article on how physics is advancing generative AI.
