What is InPainting?
Image inpainting is an active area of artificial intelligence research, and current models can already produce restorations that often look as good as, or better than, what most artists achieve by hand.
Inpainting is a way of generating an image in which the missing parts are filled in with visually and semantically plausible content. It is useful for many applications, such as advertising, improving your future Instagram posts, editing and repairing your AI-generated images, and even restoring old photos. There are many ways to perform inpainting, but the most common is to use a convolutional neural network (CNN).
A CNN is well suited to inpainting because it can learn features of the image and use them to fill in the missing content. Many different CNN architectures are available for this purpose.
Introduction to Stable Diffusion
Stable Diffusion is a latent text-to-image diffusion model capable of generating stylized and realistic images. It is pre-trained on a subset of the LAION-5B dataset, and the model can be run on consumer-grade graphics cards at home, so everyone can create stunning works of art in seconds.
How to inpaint using Stable Diffusion
This tutorial shows how to do prompt-based inpainting without painting masks by hand, using Stable Diffusion and CLIPSeg. Here, the mask is a binary image that tells the model which part of the image to redraw and which part to keep. You also need a good GPU, although the code runs fine on a Google Colab Tesla T4.
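To make the role of the mask concrete, here is a minimal pixel-level sketch. It is only an illustration: Stable Diffusion works in latent space rather than by copying pixels, and the `composite` helper and the tiny nested-list "images" are made up for this example.

```python
def composite(original, generated, mask):
    """Blend two equally sized grayscale images using a binary mask.

    Where the mask is 1, the generated pixel is used (the region to redraw);
    where the mask is 0, the original pixel is kept.
    """
    return [
        [g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, mask)
    ]


# Toy 2x3 "images": 10 stands for original content, 99 for generated content.
original = [[10, 10, 10], [10, 10, 10]]
generated = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 1], [0, 0, 1]]

result = composite(original, generated, mask)
print(result)  # [[10, 99, 99], [10, 10, 99]]
```

Only the positions where the mask is 1 change, which is the behavior we want from the shirt mask later in this tutorial.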
There are three mandatory inputs for inpainting:
- The URL of the input image
- A prompt describing the part of the image to replace
- The output prompt
You can also adjust some parameters:
- Mask precision
- Stable Diffusion generation strength
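One way to keep these inputs and parameters together is a small settings object. This is a hypothetical sketch, not part of the original notebook: the class name `InpaintSettings`, its field names, and the example URL are assumptions. The defaults (a threshold of 100 and a strength of 0.8) are the values used in the code later in this tutorial.

```python
from dataclasses import dataclass


@dataclass
class InpaintSettings:
    # Mandatory inputs
    image_url: str       # where to download the input image from
    mask_prompt: str     # text describing the part of the image to replace
    output_prompt: str   # text describing what to generate in its place
    # Adjustable parameters (defaults taken from the tutorial code)
    mask_threshold: int = 100  # grayscale cut-off (0-255) for the binary mask
    strength: float = 0.8      # generation strength passed to the pipeline

    def __post_init__(self):
        # Reject values outside the ranges assumed by this sketch.
        if not 0 <= self.mask_threshold <= 255:
            raise ValueError("mask_threshold must be between 0 and 255")
        if not 0.0 < self.strength <= 1.0:
            raise ValueError("strength must be in (0, 1]")


settings = InpaintSettings(
    image_url="https://example.com/photo.jpg",  # placeholder URL
    mask_prompt="shirt",
    output_prompt="a yellow flowered holiday shirt",
)
print(settings.mask_threshold, settings.strength)  # 100 0.8
```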
If this is your first time using Stable Diffusion from Hugging Face, you will need to accept the terms of service on the model page and get your access token from your user profile.
So let’s get started!
Install Git LFS, the open source Git extension for versioning large files
! git lfs install
Clone the clipseg repository
! git clone https://github.com/timojl/clipseg
Install the diffusers package from PyPI
! pip install diffusers -q
Install more helpers
! pip install transformers -q -UU ftfy gradient
Install CLIP using pip
! pip install git+https://github.com/openai/CLIP.git -q
Now let’s log in to Hugging Face. To do this, just run the following:
from huggingface_hub import notebook_login
notebook_login()
Once the login process is complete, you will see the following output:
Login successful
Your token has been saved to /root/.huggingface/token
%cd clipseg
! ls
datasets metrics.py supplementary.pdf environment.yml models Tables.ipynb evaluation_utils.py overview.png training.py example_image.jpg Quickstart.ipynb Visual_Feature_Engineering.ipynb experiments Readme.md weights general_utils.py score.py LICENSE setup.py
from io import BytesIO

import cv2
import PIL
import requests
import torch
from diffusers import StableDiffusionInpaintPipeline
from matplotlib import pyplot as plt
from models.clipseg import CLIPDensePredT
from PIL import Image
from torch import autocast
from torchvision import transforms
Load model
model = CLIPDensePredT(version='ViT-B/16', reduce_dim=64)
model.eval()
# Not strict, since we only store the decoder weights (not the CLIP weights)
model.load_state_dict(torch.load('/content/clipseg/weights/rd64-uni.pth', map_location=torch.device('cuda')), strict=False)
device = "cuda"
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True,
).to(device)
You can load the input image from an external URL as follows:
image_url = 'https://okmagazine.ge/wp-content/uploads/2021/04/00-promo-rob-pattison-1024x1024.jpg'
input_image = Image.open(requests.get(image_url, stream=True).raw)
# Convert to a tensor, normalize each channel, and resize to 512x512
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    transforms.Resize((512, 512)),
])
# Add a batch dimension
img = transform(input_image).unsqueeze(0)
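The mean and std values passed to `transforms.Normalize` are the widely used ImageNet channel statistics. As a rough sketch of what that step computes, here is the same per-channel arithmetic, (value - mean) / std, applied to a single RGB pixel in plain Python. The helper name `normalize_pixel` is made up for this illustration; the real transform applies the same formula to every pixel of the tensor.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]


# A pixel whose channels equal the channel means maps to zero in every channel.
mean_pixel = normalize_pixel([0.485, 0.456, 0.406])

# A pure white pixel ends up a little over two standard deviations above the mean.
white = normalize_pixel([1.0, 1.0, 1.0])
print(mean_pixel, white)
```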
Move back to the parent directory
%cd ..
Convert the input image
input_image.convert("RGB").resize((512, 512)).save("init_image.png", "PNG")
Display the image with matplotlib
plt.imshow(input_image, interpolation='nearest')
plt.show()
This will display the following image:
Now we’ll define a prompt for the mask, run the prediction, and then visualize the result:
prompts = ['shirt']
with torch.no_grad():
    preds = model(img.repeat(len(prompts), 1, 1, 1), prompts)[0]

# Show the input image next to the predicted mask for each prompt
_, ax = plt.subplots(1, 5, figsize=(15, 4))
for a in ax.flatten():
    a.axis('off')
ax[0].imshow(input_image)
for i in range(len(prompts)):
    ax[i + 1].imshow(torch.sigmoid(preds[i][0]))
    ax[i + 1].text(0, -15, prompts[i])
Now we have to convert this mask into a binary image and save it as a PNG file:
filename = "mask.png"
# Save the predicted probabilities as an image
plt.imsave(filename, torch.sigmoid(preds[0][0]))
# Reload the saved image and convert it to grayscale
img2 = cv2.imread(filename)
gray_image = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
# Threshold the grayscale image into a black-and-white mask
(thresh, bw_image) = cv2.threshold(gray_image, 100, 255, cv2.THRESH_BINARY)
# Overwrite mask.png with the binary mask so it can be loaded in the next step
cv2.imwrite(filename, bw_image)
# Display the mask
Image.fromarray(bw_image)
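The thresholding idea can also be shown without the round trip through an image file. The following is a simplified sketch in plain Python, not the notebook's method: it applies the sigmoid to raw scores and cuts directly on the resulting probabilities. The `binarize` helper, the toy scores, and the 0.5 default are assumptions, and a probability threshold is not directly comparable to the value of 100 used above, which is applied to a grayscale rendering of the saved image.

```python
import math


def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(scores, threshold=0.5):
    """Return a mask with 255 where sigmoid(score) > threshold, else 0."""
    return [
        [255 if sigmoid(v) > threshold else 0 for v in row]
        for row in scores
    ]


# Toy 2x2 grid of raw scores standing in for a CLIPSeg prediction.
scores = [[-2.0, 0.5], [3.0, -0.1]]

strict_mask = binarize(scores)        # [[0, 255], [255, 0]]
loose_mask = binarize(scores, 0.4)    # [[0, 255], [255, 255]]
print(strict_mask, loose_mask)
```

Lowering the threshold grows the masked region, which is what adjusting the mask precision amounts to.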
Now we have a mask that looks like this:
Now load the input image and the mask we created:
init_image = Image.open('init_image.png')
mask = Image.open('mask.png')
And finally, the last step: inpaint the masked region according to a prompt of your choice. Depending on your hardware, this will take a few seconds.
with autocast("cuda"):
    images = pipe(prompt="a yellow flowered holiday shirt", init_image=init_image, mask_image=mask, strength=0.8)["sample"]
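As a rough intuition for the `strength` argument: in image-to-image style diffusion pipelines, it typically controls how much noise is added to the starting image, and therefore how many denoising steps actually run. The sketch below rests on that assumption; the helper `effective_steps` and the figure of 50 scheduled steps are illustrative, and the exact formula may differ between diffusers versions.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Estimate how many denoising steps run for a given strength.

    Assumes the pipeline runs roughly num_inference_steps * strength steps,
    capped at the number of scheduled steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


# With 50 scheduled steps, the tutorial's strength of 0.8 gives about 40 steps.
steps_at_08 = effective_steps(50, 0.8)
print(steps_at_08)  # 40
```

Under this assumption, a higher strength lets the model depart further from the original pixels in the masked region, while a lower strength stays closer to them.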
On Google Colab, you can print out an image simply by entering its name:
images[0]
Now you’ll see that the shirt we created the mask for has been replaced according to our new prompt!
Reprint: Quickly repair images using stable diffusion (mvrlink.com)