Generating 4K PBR textures with Stable Diffusion XL

I’m continuing the work I started last year using Three.js to build 3D scenes and sketches.

At the time, AI image generators like DALL-E and Stable Diffusion were just really taking off. I successfully ran Stable Diffusion locally and used it to generate textures for terrain, buildings, and other environments in a 3D world I was building.

At the time I was using Stable Diffusion v1. I had found some prompts I liked that created images with a style like this:


SDv1 is very good at generating textures like this; I’ve generated hundreds, if not thousands, of images using different prompts.

It’s now August 2023, and the AI image generation ecosystem continues to advance by leaps and bounds. I’ve been experimenting with using Stable Diffusion XL to generate textures, mostly terrain so far, and I’ve been getting some great results.

At first I got some frustrating results, because the prompts I had used previously didn’t work well. However, after some experimentation, I got results that were much better than anything I had achieved in the past.

1. SDXL v1.0 PBR texture generation workflow

I’ve come up with a pretty solid workflow for generating high-quality 4K PBR textures using Stable Diffusion XL and a few other tools. I’m really impressed with the results so far, and I’ve only been working on this stuff for a few days now.

This is the result I’m able to get:

2. Installing and running SDXL on a 7900 XTX

I recently bought a 7900 XTX graphics card. It has 24GB of VRAM, which is enough to produce 1024×1024 images with Stable Diffusion XL without needing upscaling or other tricks.

I chose the AUTOMATIC1111 webui to install and run Stable Diffusion. It seems to be the most feature-rich and popular option, and it supports AMD GPUs out of the box. I cloned the repository and ran the ./webui.sh script, which automatically installed dependencies and downloaded the Stable Diffusion weights and other model files.

To get it running, though, I did have to do a bit of manual work. I got a segfault on the first run, but I was able to fix it by setting two environment variables:

export CL_EXCLUDE=gfx1036
export HSA_OVERRIDE_GFX_VERSION=11.0.0

I’m currently using ROCm v5.5, and I believe that’s why this is needed. That release adds support for AMD 7000-series GPUs, but the support is not yet complete. These environment variables trick PyTorch and other libraries into treating the card as a different, already-supported GPU model, which makes things work.

I’ve previously written some instructions on installing ROCm and TensorFlow on the 7900 XTX; however, they may be outdated.

Anyway, after installing and exporting these environment variables, the Web UI starts and generates the image!

3. SDXL prompts

The first step in the process is finding some good prompts for texture generation. This is easily the most creative part of the whole thing, and it has the biggest impact on your results.

I don’t have much guidance here; it depends a lot on what you want to generate and the style you’re going for. However, here is an example of a prompt I’ve had a lot of success with:

top-down image of rough solid flat dark, rich slate rock. interspersed with bright ((flecks)) of ((glinting)) metallic spots like mica. high quality photograph, detailed, realistic

Negative prompt:

blurry, cracks, crevice, round, smooth, distinct, pebble, shadows, coin, circle

Note the parentheses around some of the words. This is a feature of the AUTO1111 webui that tells the model to put extra emphasis on the parts of the prompt they enclose. There are other fancy prompt-syntax tricks as well; there’s tons of flexibility and plenty of options to explore.
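
For what it’s worth, my understanding of the basic syntax is:

(flecks) - weights “flecks” by roughly 1.1x
((flecks)) - nests to roughly 1.21x
(flecks:1.5) - sets an explicit weight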

4. SDXL parameters

The next step is to come up with good parameters for image generation.

I found that Stable Diffusion XL is actually more sensitive to the selected parameters than older versions were. The bottom line is that there’s a lot more to tune than there was a year ago, and as a result it took me quite a while to come up with settings that produced images I was happy with.

To solve this problem, I made extensive use of the “X/Y/Z plot” feature of the AUTO1111 webui. It’s in the “Scripts” dropdown in the user interface:

It generates a grid of output images, where each cell is generated using a different combination of parameters. The output when using it looks like this:

It was very useful in determining a good set of baseline parameters for my images.
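
For example, a sweep might be configured something like this (the axis choices and values here are just an illustration), producing a 3×3 grid of nine images from a single prompt:

X type: CFG Scale, X values: 5.5, 6.5, 7.5
Y type: Steps, Y values: 30, 60, 90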

After a lot of experimenting, here are the baseline parameters I now use when generating new textures:

  • Sampling method: Euler
  • Number of sampling steps: 60
  • Width/Height: 1024×1024
  • CFG Scale: 6.5 (I find it especially important to get this right)
  • Tiling: Enabled (required to generate seamlessly tiled textures; extremely important)
  • Hires. fix: Disabled (I had no luck with anything I tried with it enabled)

You may need to experiment on your own to find the parameters that work for your use case, but these may serve as a good baseline when starting out.
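
If you’d rather script batches than click through the UI, the webui also exposes an HTTP API when launched with the --api flag. Here is a minimal TypeScript sketch that requests one image using the baseline parameters above; the endpoint and field names are from the AUTO1111 API as I understand it, so verify them against your version:

import { writeFile } from "node:fs/promises";

// Minimal sketch: request one seamlessly-tiled texture from a local
// AUTO1111 instance started with the --api flag.
const payload = {
  prompt:
    "top-down image of rough solid flat dark, rich slate rock. " +
    "interspersed with bright ((flecks)) of ((glinting)) metallic spots " +
    "like mica. high quality photograph, detailed, realistic",
  negative_prompt:
    "blurry, cracks, crevice, round, smooth, distinct, pebble, shadows, coin, circle",
  sampler_name: "Euler",
  steps: 60,
  width: 1024,
  height: 1024,
  cfg_scale: 6.5,
  tiling: true, // critical for seamless textures
};

const res = await fetch("http://127.0.0.1:7860/sdapi/v1/txt2img", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(payload),
});
const { images } = (await res.json()) as { images: string[] };
// The API returns base64-encoded PNGs; write the first one to disk.
await writeFile("texture.png", Buffer.from(images[0], "base64"));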

Another thing I’ve observed is that enabling tiling significantly changes the overall look of the resulting image. This can get tricky if, for example, you use DreamStudio to generate images online to take advantage of its fast GPUs and then try to replicate those results locally. Their hosted UI has no tiling option, nor does it allow adjusting parameters such as the sampler, making it difficult to get an exact match.

5. Converting textures from 1K to 4K

Once I’ve found a good prompt and good parameters, I just set it to generate a bunch of images and let it run. I let it run for about 2 hours and selected about 50 images. While many were bad, there were more than enough gems in the pile to work with.

One thing you may have noticed is that I mentioned “4K textures” in the title of this post, but all the images I’ve generated so far have been 1K. Well, if you think I’m about to say that I upscaled them with one of the many possible AI upscaling methods, you’re actually wrong!

I came up with my own method of combining AI-generated textures to produce a higher-resolution output. The resulting texture retains the seamless/infinitely-tiling properties of the source images.

I built a tool that runs in the browser to do this easily; you can use it here, 100% free and open source.

The process is very simple: you drag and drop 4 similar-looking, seamlessly tiled textures generated by Stable Diffusion, and the tool combines them. As I mentioned, the output also tiles seamlessly, at 4x the resolution.

The tool’s UI looks like this:

You can download the result as a PNG by right-clicking the generated image and selecting “Save Image As”. The output is quite large by default, so you may want to use a tool like Squoosh to compress/optimize it.

Compared to Stable Diffusion and other AI tools, its implementation is actually quite low-tech, but I’ve found the overall effect to be very good!
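
For the curious, the core of the combination step can be sketched in a few lines of browser code: draw the four 1K tiles into the quadrants of a canvas twice as large. This naive version is just for illustration; the real tool presumably does extra work to hide the interior seams where different tiles meet:

// Naive sketch: composite four equally-sized square tiles into one
// image at 2x the resolution in each dimension.
async function loadImage(url: string): Promise<HTMLImageElement> {
  const img = new Image();
  img.src = url;
  await img.decode(); // resolves once the image is loaded and decoded
  return img;
}

async function stitch(urls: [string, string, string, string]): Promise<Blob> {
  const tiles = await Promise.all(urls.map(loadImage));
  const size = tiles[0].naturalWidth; // e.g. 1024 for SDXL output
  const canvas = document.createElement("canvas");
  canvas.width = canvas.height = size * 2;
  const ctx = canvas.getContext("2d")!;
  // Place one tile in each quadrant of the output.
  tiles.forEach((tile, i) =>
    ctx.drawImage(tile, (i % 2) * size, Math.floor(i / 2) * size)
  );
  return new Promise((resolve) =>
    canvas.toBlob((blob) => resolve(blob!), "image/png")
  );
}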

It works best with images that look very similar. If there are large differences in style, color, etc., it may be noticeable in the resulting output and look bad.

A good way to get source images that look very similar is to find a generated image you like and then generate variations of it. To do this, upload the image to the “PNG Info” tab of the AUTO1111 webui. Then click the “Send to txt2img” button, and it will pre-populate the UI with all the parameters used to generate that image.

Then, check the “Extra” checkbox on the txt2img tab, set “Variation strength” to a small value such as 0.1-0.3, and generate a dozen or so images. Overall they should look similar to the original, but with different details. If you pick 4 of these and upload them to the Seamless Texture Stitcher tool, I find the output usually looks good.
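
If you’re scripting against the API as in the earlier sketch, the same trick maps, as far as I can tell, to the seed, subseed, and subseed_strength fields of the txt2img payload:

// Sketch: request a batch of close variations of a seed you liked.
// Extends the `payload` object from the earlier txt2img sketch.
const variationPayload = {
  ...payload,
  seed: 1234567890,      // the seed of the image you liked (shown in PNG Info)
  subseed: -1,           // -1 picks a random subseed per image
  subseed_strength: 0.2, // analogous to “Variation strength” in the UI
  batch_size: 4,         // four similar tiles, ready for stitching
};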

6. Building PBR textures

Once you have your stitched 4K images that you’re happy with, it’s time to turn them into full PBR textures.

To do this, I used a tool called Poly. They provide an AI-powered texture generator that takes an image as input and generates normal, height, ambient occlusion, metalness, and roughness maps for it. They also offer their own prompt-based generation tool, but I personally prefer the control of generating the input image myself.

Their tool lets you generate normal and height maps for free, but they charge $20 per month to generate the other maps. I do pay for that subscription now, but you don’t have to pay to get good results. The normal map is the most important part; you can set global roughness/metalness values for the texture (which is all you need a lot of the time), or write a custom shader to derive them dynamically from pixel values.
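
To illustrate that last point, here’s a sketch of a Three.js material that uses just the color and normal maps plus constant roughness/metalness values (the file names are placeholders):

import * as THREE from "three";

// Sketch: a usable PBR material from only a color map and a normal map,
// with global roughness/metalness values instead of dedicated maps.
const loader = new THREE.TextureLoader();
const loadTiled = (url: string): THREE.Texture => {
  const tex = loader.load(url);
  tex.wrapS = tex.wrapT = THREE.RepeatWrapping; // the textures tile seamlessly
  tex.repeat.set(4, 4); // repeat 4x across the surface
  return tex;
};

const colorMap = loadTiled("slate_color_4k.png"); // placeholder file names
colorMap.colorSpace = THREE.SRGBColorSpace; // color maps are sRGB on recent three versions

const material = new THREE.MeshStandardMaterial({
  map: colorMap,
  normalMap: loadTiled("slate_normal_4k.png"),
  roughness: 0.9, // one global value for the whole texture
  metalness: 0.1,
});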

Another option for generating normal maps is SmartNormap. It uses non-AI, programmatic methods to generate a normal map from any source image. It has some adjustable parameters and the results are decent, but in my experience they’re overall not as good as the AI-powered Poly tool’s.
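
If you’re curious what the programmatic approach looks like under the hood, the classic technique (not necessarily SmartNormap’s exact algorithm) treats the image’s brightness as a height field, takes horizontal and vertical gradients, and packs the resulting surface normals into RGB. A sketch:

function toNormalMap(src: ImageData, strength = 2): ImageData {
  // `strength` is a made-up tuning knob, analogous to such tools’ sliders.
  const { width: w, height: h, data } = src;
  // Sample brightness with wrap-around so the normal map tiles like its source.
  const lum = (x: number, y: number): number => {
    const i = (((y + h) % h) * w + ((x + w) % w)) * 4;
    return (data[i] * 0.299 + data[i + 1] * 0.587 + data[i + 2] * 0.114) / 255;
  };
  const out = new ImageData(w, h);
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      // Central differences approximate the height gradient.
      const dx = (lum(x + 1, y) - lum(x - 1, y)) * strength;
      const dy = (lum(x, y + 1) - lum(x, y - 1)) * strength;
      const len = Math.hypot(dx, dy, 1);
      const i = (y * w + x) * 4;
      // Pack the unit normal (-dx, -dy, 1)/len from [-1, 1] into [0, 255] RGB.
      out.data[i] = ((-dx / len) * 0.5 + 0.5) * 255;
      out.data[i + 1] = ((-dy / len) * 0.5 + 0.5) * 255;
      out.data[i + 2] = ((1 / len) * 0.5 + 0.5) * 255;
      out.data[i + 3] = 255; // opaque alpha
    }
  }
  return out;
}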

Anyway, yes, you can then download the textures and drop them into Blender, Three.js, or whatever other 3D software you want.

If you want to get even crazier with it, the seamless 4K output texture can be used with a hex tiling algorithm, making it tile infinitely without any visible repetition. I’m working on a Three.js library to automate this, and here’s a render I made of some terrain in Blender:

7. Conclusion

It really feels like the floodgates are opening here. I personally find that using AI to generate textures and other building-block assets, rather than full images or finished artwork, is the way to go. There’s plenty of room for my own creativity and input to guide the process, there are endless possibilities to explore, and the whole workflow is, in my opinion, quite fun.

Regardless, I hope you find this helpful and good luck if you decide to try it out for yourself.
