Some Stable Diffusion Research

Sep 5, 2024

I got curious how Stable Diffusion works.

Here’s some links:

Diffusion models, in short:

PNG -> some “latent space”
Compressed with the encoder half of a “variational autoencoder” (VAE)
Add noise sampled from a Gaussian
Pass (prompt, noisy_latent, timestep) into a U-Net
U-Net predicts noise that was added at that timestep
A scheduler/sampler (e.g. DDPM, Euler) subtracts the predicted noise
Pass the result through the VAE decoder
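
To make the "add noise" and "subtract predicted noise" steps concrete, here is a rough NumPy sketch of the arithmetic. The linear beta schedule (1e-4 to 0.02 over 1000 steps) is borrowed from the DDPM paper as an assumption; Stable Diffusion's own schedule differs. The function names are made up for illustration, and the "U-Net" is faked by handing back the exact noise that was added.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed schedule: linear betas as in the DDPM paper (not Stable Diffusion's actual values).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)  # how much of the original signal survives at step t

def add_noise(latent, noise, t):
    """Forward step: blend the clean latent with Gaussian noise for timestep t."""
    return np.sqrt(alpha_bars[t]) * latent + np.sqrt(1.0 - alpha_bars[t]) * noise

def remove_noise(noisy_latent, predicted_noise, t):
    """Invert add_noise, given a noise estimate (normally the U-Net's output)."""
    return (noisy_latent - np.sqrt(1.0 - alpha_bars[t]) * predicted_noise) / np.sqrt(alpha_bars[t])

latent = rng.standard_normal((4, 64, 64))  # random stand-in for a VAE-encoded image
noise = rng.standard_normal(latent.shape)
t = 500

noisy = add_noise(latent, noise, t)
# Pretend the U-Net predicted the noise perfectly.
recovered = remove_noise(noisy, noise, t)
print(np.allclose(recovered, latent))
```

A real sampler does not jump back in one step like this. It takes many small steps, because the U-Net's noise estimate is only approximate.
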

PNG -> Noise -> U-Net -> New PNG.

PNG -vae_encoder> latent image -gaussian_noise> latent+noise -unet> new_latent -vae_decoder> New PNG
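
The same chain, written as a toy Python sketch that only tracks how the data flows and what shape it has at each stage. Every function here is a hypothetical stand-in: the "encoder" is plain 8x average pooling, the "decoder" is 8x upsampling, the "U-Net" cheats by returning the true noise, and the 0.5 signal fraction is an arbitrary assumption. In the real Stable Diffusion v1 models the VAE also shrinks each side by 8x, but the latent has 4 channels rather than 3.

```python
import numpy as np

rng = np.random.default_rng(0)

def vae_encode(image):
    """Toy encoder: 8x average pooling (a real VAE encoder is a learned network)."""
    h, w, c = image.shape
    return image.reshape(h // 8, 8, w // 8, 8, c).mean(axis=(1, 3))

def vae_decode(latent):
    """Toy decoder: 8x nearest-neighbour upsampling."""
    return latent.repeat(8, axis=0).repeat(8, axis=1)

def make_oracle_unet(true_noise):
    """Toy U-Net: ignores its inputs and returns the noise that was actually added."""
    def unet(noisy_latent, timestep, prompt):
        return true_noise
    return unet

image = rng.random((512, 512, 3))           # stand-in for a decoded PNG
latent = vae_encode(image)                  # (64, 64, 3)

signal = 0.5                                # assumed fraction of signal kept at this timestep
noise = rng.standard_normal(latent.shape)
noisy_latent = np.sqrt(signal) * latent + np.sqrt(1 - signal) * noise

unet = make_oracle_unet(noise)
predicted = unet(noisy_latent, timestep=500, prompt="a photo of a cat")
new_latent = (noisy_latent - np.sqrt(1 - signal) * predicted) / np.sqrt(signal)

new_image = vae_decode(new_latent)          # back to (512, 512, 3)
print(latent.shape, new_image.shape)
```
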