Stable Diffusion: A state-of-the-art text-to-image model

Stable Diffusion

Generative AI technology is advancing rapidly and can now generate text and images from simple text input. Stable Diffusion is a text-to-image model that enables you to create photorealistic images for your applications.

Diffusion models are trained by learning to remove noise that has been added to real images, and this denoising process is what lets them produce lifelike images. They can also generate images from text alone by conditioning the generation process on the input text. Stable Diffusion, for example, is a latent diffusion model: it learns to recognize shapes in pure noise and gradually brings those shapes into focus when they match the words in the input text. The text is first embedded into the latent space using a language model. A U-Net architecture then performs a series of noise-addition and denoising operations in the latent space. Finally, the denoised output is decoded into pixel space.
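
In code form, the control flow just described looks roughly like the following toy sketch. It is purely illustrative: embedText, denoiseStep, and decodeLatent are stand-ins for the real language model, U-Net, and decoder, and show only the order of operations, not the actual math.

// Toy sketch of the latent diffusion loop; every function here is a
// stand-in that mimics the data flow, not a real model component.
const embedText = (prompt) => prompt.split(' ');          // language model: text -> embedding
const denoiseStep = (latent, embedding, t) =>
  latent.map((x) => x * 0.9);                             // U-Net: remove a little noise per step
const decodeLatent = (latent) => latent;                  // decoder: latent space -> pixel space

let latent = Array.from({ length: 4 }, () => Math.random()); // start from pure noise
const embedding = embedText('Photo of astronauts riding horses on Mars');
for (let t = 10; t > 0; t--) {
  latent = denoiseStep(latent, embedding, t);             // shapes gradually come into focus
}
const image = decodeLatent(latent);
console.log(image);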

Here are some examples of input text and corresponding output images generated by Stable Diffusion.

The following images are responses to the input: “Photo of astronauts riding horses on Mars,” “Impressionist-style painting of New York City,” and “Dog in suit.”

The following images are responses to the inputs: (i) “a dog playing poker,” (ii) “a color photo of a castle in a forest with trees,” and (iii) the same castle prompt with the negative prompt “yellow.”
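
To make the prompt and negative prompt concrete: the toolkit introduced below talks to a locally deployed Stable Diffusion web API, and a plain txt2img request against such a service looks roughly like the sketch below. This assumes an AUTOMATIC1111-style API (the kind exposed at http://127.0.0.1:7860 later in this article); the endpoint and field names come from that project, not from Stable Diffusion itself.

// Minimal txt2img request against an AUTOMATIC1111-style Stable Diffusion API
// (assumed service; see the installation guide referenced later in this article)
fetch('http://127.0.0.1:7860/sdapi/v1/txt2img', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: 'a color photo of a castle in a forest with trees',
    negative_prompt: 'yellow', // exclude unwanted elements, as in example (iii)
    steps: 20, // number of sampling steps
  }),
})
  .then((res) => res.json())
  .then(({ images }) => {
    // images is an array of base64-encoded PNGs
    console.log(`Received ${images.length} image(s)`);
  });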

A toolkit built on the Stable Diffusion AI model

DreamTexture.js is a development kit, built on the Stable Diffusion AI model, that automatically generates and applies textures to 3D models, adding fast automatic texturing of 3D models to WebGL applications.

Figure 1 shows the original model; Figures 2 and 3 show the model after texturing. Prompt: city, Realistic, cinematic, Front view, Game scene graph

1. DreamTexture.js development kit contents

DreamTexture.js is built on Three.js and the Stable Diffusion AI model and is used to automatically texture 3D models. In the current version (V1.0), the main files and directory structure are as follows:

Development package file     Description
dream-texture.cjs            Library file in CJS format
dream-texture.esm            Library file in ESM format
dream-texture.umd            Library file in UMD format
stable-diffusion-guide.md    Installation guide for the Stable Diffusion service used by DreamTexture.js
LICENSE.md                   License agreement for the development kit
example/                     Directory of DreamTexture.js usage examples

2. Getting started quickly with the DreamTexture.js development kit

Taking the ESM library as an example, this section introduces how to use the DreamTexture.js development kit to add automatic 3D model texturing to Three.js applications.

First, follow the Stable Diffusion service installation guide included in the development kit to deploy your own Stable Diffusion API service; both Windows and Linux are supported.

Next, set up the Three.js development environment. Once installation is complete, import the DreamTexture.js library file. Taking the ESM library as an example, the import code is as follows:

import * as THREE from 'three';
// GLTFLoader ships as a Three.js addon; the exact path may vary with your setup
import { GLTFLoader } from 'three/addons/loaders/GLTFLoader.js';
import DreamTexture from './dream-texture.esm.min';

Now create a scene, import the glTF model into the scene, and rotate or move the model appropriately. The snippets here assume a basic Three.js scene; a minimal setup looks like this:
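
// Minimal Three.js scene setup (standard boilerplate, independent of DreamTexture.js)
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(
  45, window.innerWidth / window.innerHeight, 0.1, 100
);
camera.position.set(0, 0, 5);
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

With the scene in place, load the model: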

// Import the model into the scene
const gltfLoader = new GLTFLoader();
gltfLoader.load('monkey.glb', (gltf) => {
  scene.add(gltf.scene);
  // Rotate the model to any angle you want!
  gltf.scene.rotation.y = -Math.PI / 4;
});

Then instantiate a DreamTexture object, making sure to specify the URL of your Stable Diffusion API service in the parameters:

// Initialize the DreamTexture object, passing in your Stable Diffusion API address
const dt = new DreamTexture({
  baseUrl: 'http://127.0.0.1:7860', // Stable Diffusion URL
});

Now you can call the DreamTexture object's setTexture method, passing in the prompt and other parameters; the AI model will then automatically generate a texture image and project it onto the model. The code is as follows:

// Prompt and other parameters
// After the Stable Diffusion API has started, its documentation is available at http://127.0.0.1:7860/docs
const params = {
  prompt: 'monkey head, Brown hair, cartoon', // the more detailed the prompt, the more closely the output follows it; the shorter the prompt, the more creative the result
  negative_prompt: 'blurry', // content Stable Diffusion should not generate; used to exclude unwanted elements
  denoising_strength: 0.85, // denoising strength
  cfg_scale: 15, // text CFG scale
  image_cfg_scale: 7, // image CFG scale
  steps: 10, // number of sampling steps
  sampler_index: 'DPM++ SDE Karras',
  sampler_name: '',
};
dt.setTexture(scene, params).then(() => {
  console.log('Texture added successfully!');
});

The automatic texturing effect of the 3D model is as follows:

Case 1:


Figure 1 shows the original model; Figures 2 and 3 show the model after texturing. Prompt:

car, Realistic, photography, hyper quality, high detail, high resolution, Unreal Engine, Side view

Case 2:


Figure 1 shows the original model; Figures 2 and 3 show the model after texturing. Figure 2 prompt:

Realistic, photography, bottle, porcelain

Figure 3 uses the same prompt with ‘porcelain’ replaced by ‘glass’.

3. Using the DreamTexture.js CJS/UMD library files

DreamTexture.js supports three commonly used JS library formats: in addition to the ESM format introduced earlier, it also ships CJS and UMD builds.

The import code for the CJS library is as follows:

const DreamTexture = require('./dream-texture.cjs.js');

The code to include the UMD library is as follows:

<script src="./three.js"></script>
<script src="./dream-texture.umd.js"></script>

4. DreamTexture.js development kit API description

The DreamTexture.js API is very simple and is described as follows:

  • new DreamTexture({ baseUrl })

Initializes a DreamTexture object, later used for automatic texturing of 3D models.

Parameter    Description
baseUrl      Stable Diffusion API address
  • dreamTexture.setTexture(object3d: THREE.Object3D, params)

DreamTexture uses the front view of the given object3d as the basis for automatically texturing the 3D scene, including texture generation and automatic projection.

Parameter    Description
object3d     THREE.Object3D; Group and Mesh are supported.
params       Stable Diffusion img2img API parameters
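
Since both Group and Mesh are supported, setTexture can be applied to a single mesh as well as to a whole scene. A minimal sketch, reusing the dt and params objects from section 2:

// Texture a single mesh instead of the whole scene
const mesh = new THREE.Mesh(
  new THREE.BoxGeometry(1, 1, 1),
  new THREE.MeshStandardMaterial()
);
scene.add(mesh);
dt.setTexture(mesh, params).then(() => {
  console.log('Mesh textured!');
});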

Reprinted from: Stable Diffusion: The most advanced text-to-image model (mvrlink.com)