
Tiled Diffusion

A novel approach for generating seamlessly tileable images using diffusion models.

Or Madar, Ohad Fried
Reichman University
Image tiling—the seamless connection of disparate images to create a coherent visual field—is crucial for applications such as texture creation, video game asset development, and digital art. Traditionally, tiles have been constructed manually, a method that poses significant limitations in scalability and flexibility. Recent research has attempted to automate this process using generative models. However, current approaches primarily focus on tiling textures and manipulating models for single-image generation, without inherently supporting the creation of multiple interconnected tiles across diverse domains. This paper presents Tiled Diffusion, a novel approach that extends the capabilities of diffusion models to accommodate the generation of cohesive tiling patterns across various domains of image synthesis that require tiling. Our method supports a wide range of tiling scenarios, from self-tiling to complex many-to-many connections, enabling seamless integration of multiple images. Tiled Diffusion automates the tiling process, eliminating the need for manual intervention and enhancing creative possibilities in various applications, such as seamless tiling of existing images, tiled texture creation, and 360° synthesis.


Installation

conda create -n td python=3.10
conda activate td
pip install --upgrade pip
pip install -r requirements.txt

Quick Start

import torch
import matplotlib.pyplot as plt
import numpy as np
from latent_class import LatentClass
from model import SDLatentTiling

model = SDLatentTiling()
device = 'cuda' if torch.cuda.is_available() else 'cpu'
prompt_1 = "Red brick texture"
prompt_2 = "Green brick texture"
negative_prompt = "blurred, ugly, deformed, disfigured, poor details, bad anatomy, pixelized, bad order"
max_width = 32 # Context size (w)

# Many-to-many example on the X axis
lat1 = LatentClass(prompt=prompt_1, negative_prompt=negative_prompt, side_id=[1, 1, None, None],
                   side_dir=['cw', 'ccw', None, None])

lat2 = LatentClass(prompt=prompt_2, negative_prompt=negative_prompt, side_id=[1, 1, None, None],
                   side_dir=['cw', 'ccw', None, None])
latents_arr = [lat1, lat2]

new_latents_arr = model(latents_arr=latents_arr,
                        negative_prompt=negative_prompt,
                        max_width=max_width,
                        device=device)

lat1_new = new_latents_arr[0]
lat2_new = new_latents_arr[1]
# Compose a strip along X; with many-to-many constraints any ordering tiles seamlessly.
t_1 = np.concatenate((lat1_new.image, lat2_new.image, lat2_new.image, lat1_new.image),
                     axis=1)

plt.imshow(t_1)
plt.show()

Usage

python run.py

In run.py, configure the latents using the class LatentClass, where side_id is a list of IDs (i.e., the color patterns shown in the teaser for the constraints) and side_dir is a list of orientations ('cw' or 'ccw' for clockwise or counter-clockwise). Both lists always have size 4, with indexes corresponding to the sides (Right, Left, Up, Down). A simple example:

side_id=[1, 1, None, None],
side_dir=['cw', 'ccw', None, None]

This means that the right and left sides should connect (a connection always pairs opposite orientations of the same ID). After the model finishes, each LatentClass holds an attribute called image, which is the resulting image of the diffusion process. For ControlNet / Differential Diffusion / SD 3 / SD XL, see the corresponding directories (controlnet, diffdiff, sd3, sdxl); each contains its own example.py.
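
For reference, here is a minimal sketch of running a single latent and saving its image attribute. It reuses model, device, prompt_1, negative_prompt, and max_width from the Quick Start; that plt.imsave accepts the value range of .image is an assumption.

lat1 = LatentClass(prompt=prompt_1, negative_prompt=negative_prompt,
                   side_id=[1, 1, None, None],
                   side_dir=['cw', 'ccw', None, None])

new_latents = model(latents_arr=[lat1],
                    negative_prompt=negative_prompt,
                    max_width=max_width,
                    device=device)

# Each returned LatentClass now carries the generated image.
result = new_latents[0].image
plt.imsave("tile.png", result)  # assumes .image is an HxWx3 array in a range plt accepts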

Examples

Below are declaration examples for the self-tiling, one-to-one, and many-to-many scenarios, followed by an img2img example.

Self-tiling

lat1 = LatentClass(prompt=PROMPT, negative_prompt=NEGATIVE_PROMPT, side_id=[1, 1, 2, 2],
                   side_dir=['cw', 'ccw', 'cw', 'ccw'])

This example represents a self-tiling scenario, where I1 seamlessly connects to itself along both the X and Y axes.
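
A quick way to check the result is to repeat the image in a 2×2 grid; if the declaration above held, the seams should be invisible. A minimal sketch, assuming the result lat1_new was obtained from model() as in the Quick Start and that .image is an HxWx3 NumPy array:

import numpy as np
import matplotlib.pyplot as plt

# Repeat the result twice along Y (axis 0) and twice along X (axis 1).
grid = np.tile(lat1_new.image, (2, 2, 1))
plt.imshow(grid)
plt.show()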

One-to-one

lat1 = LatentClass(prompt=PROMPT1, negative_prompt=NEGATIVE_PROMPT1, side_id=[1, 2, None, None],
                   side_dir=['cw', 'ccw', None, None])
lat2 = LatentClass(prompt=PROMPT2, negative_prompt=NEGATIVE_PROMPT2, side_id=[2, 1, None, None],
                   side_dir=['cw', 'ccw', None, None])

This example represents a one-to-one scenario, where I1 and I2 can connect to each other along the X axis.
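
With these constraints, the seamless ordering along X alternates the two images: I1's right side carries ID 1 'cw', which matches I2's left side ID 1 'ccw', and symmetrically for ID 2. A sketch of composing such a strip, assuming lat1_new and lat2_new were obtained from model() as in the Quick Start:

# Alternate I1 and I2: each adjacent right/left pair shares an ID with opposite orientations.
strip = np.concatenate((lat1_new.image, lat2_new.image,
                        lat1_new.image, lat2_new.image), axis=1)
plt.imshow(strip)
plt.show()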

Many-to-many

lat1 = LatentClass(prompt=PROMPT1, negative_prompt=NEGATIVE_PROMPT1, side_id=[1, 1, None, None],
                   side_dir=['cw', 'ccw', None, None])
lat2 = LatentClass(prompt=PROMPT2, negative_prompt=NEGATIVE_PROMPT2, side_id=[1, 1, None, None],
                   side_dir=['cw', 'ccw', None, None])

This example represents a many-to-many scenario, where I1 and I2 can connect both to each other and to themselves along the X axis.
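
Because every right and left side carries the same ID (1) with opposite orientations, any sequence of the two images tiles seamlessly. A sketch composing an arbitrary ordering, again assuming lat1_new and lat2_new from a model() call as in the Quick Start:

import random

# Any ordering of I1 / I2 is valid in the many-to-many case.
tiles = [random.choice([lat1_new.image, lat2_new.image]) for _ in range(4)]
strip = np.concatenate(tiles, axis=1)
plt.imshow(strip)
plt.show()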

Img2img

Uncomment the img2img lines within the file run.py:

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
response = requests.get(url)
input_image = Image.open(BytesIO(response.content)).convert("RGB")
input_image = input_image.resize((768, 512))
lat1 = LatentClass(prompt=PROMPT, negative_prompt=NEGATIVE_PROMPT, side_id=[1, 1, None, None],
                   side_dir=['cw', 'ccw', None, None], source_image=input_image)

When the source_image argument is set to a PIL image, the code automatically detects it and encodes it with the VAE, so the diffusion process starts from that latent representation instead of random Gaussian noise. The result is a transformed image that tiles along the X axis. (Note that this is generic img2img, not the Tiling Existing Images application; for that application, see example.py under the diffdiff folder.)
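
Running this latent works the same as the noise-initialized case. A sketch, reusing model, device, and the NumPy/Matplotlib imports from the Quick Start; max_width=32 matches the Quick Start value:

new_latents = model(latents_arr=[lat1],
                    negative_prompt=NEGATIVE_PROMPT,
                    max_width=32,
                    device=device)

# The transformed image should now tile along the X axis.
result = new_latents[0].image
plt.imshow(np.concatenate((result, result), axis=1))
plt.show()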

BibTeX

@misc{madar2024tileddiffusion,
      title={Tiled Diffusion}, 
      author={Or Madar and Ohad Fried},
      year={2024},
      eprint={2412.15185},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.15185}, 
}