Denoising Steps: Gradual Clarification
The model removes noise across multiple steps to restore the image.
Instead of finishing at once, it gradually refines toward the target image.
Adjust the step count with the slider and press Play to see the process.
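
The step-by-step refinement can be sketched as a toy loop. This is only an illustration under loud assumptions: a real diffusion model uses a trained neural network to predict the noise, whereas the hypothetical `denoise` function below cheats by knowing the target and simply removes a share of the remaining error at each step.

```python
import random

def denoise(target, steps, seed=0):
    """Toy denoiser (assumption: the target is known in advance).

    Starts from random noise and removes an equal share of the
    remaining error at every step, so the last step lands on target.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    history = [list(x)]
    for step in range(steps):
        remaining = steps - step
        # Move 1/remaining of the way toward the target.
        x = [xi + (ti - xi) / remaining for xi, ti in zip(x, target)]
        history.append(list(x))
    return history

def error(x, target):
    """Total absolute distance between the current state and the target."""
    return sum(abs(a - b) for a, b in zip(x, target))

target = [0.0, 0.5, 1.0, 0.5]           # stand-in for a tiny "image"
history = denoise(target, steps=5)
errors = [error(x, target) for x in history]
print([round(e, 3) for e in errors])    # error shrinks step by step
```

More steps mean smaller changes per step; fewer steps mean each step has to remove more noise at once.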

Start from Noise: Unique Fingerprint
Diffusion 'excavates' images from random noise.
Different starting numbers (seeds) create different noise patterns, and each pattern leads to a completely different image.
Compare two starting numbers: with the same prompt, different starting points yield different images.
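
A small sketch of why the starting number acts like a fingerprint, using only Python's standard `random` module. The `make_noise` helper and the seed values are illustrative assumptions, and a real image generator would draw a much larger noise tensor.

```python
import random

def make_noise(seed, size=4):
    """Return a reproducible list of Gaussian noise values for a seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

noise_a = make_noise(seed=42)
noise_b = make_noise(seed=43)
noise_a_again = make_noise(seed=42)

print(noise_a == noise_a_again)  # same seed, same starting noise
print(noise_a == noise_b)        # different seed, different starting noise
```

Because the starting noise is fully determined by the seed, reusing a seed with the same prompt and settings is the usual way to reproduce an image.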


Text Conditioning: Text Guide
Text is converted to numbers, and those numbers guide image creation.
At each denoising step, the text embedding guides the direction of noise removal.
See the prompt → CLIP encoder → embedding → Cross-Attention → result pipeline.





"A cat sitting on a rainbow"
The text embedding influences the direction of noise removal at each step, so the result converges toward an image matching the description.
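
The cross-attention step can be sketched in a few lines of plain Python. Everything numeric here is an assumption for illustration: the 2-D token embeddings and patch queries are made up, and the learned projection matrices that a real model applies to queries, keys, and values are omitted.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Each image-patch query attends over all text-token keys and
    returns a weighted mix of the token values."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical embeddings for three tokens of the prompt.
token_names = ["cat", "sitting", "rainbow"]
token_embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
# Hypothetical queries coming from two image patches.
patch_queries = [[4.0, 0.0], [0.0, 4.0]]

outputs, weights = cross_attention(patch_queries, token_embeddings, token_embeddings)
for patch, w in enumerate(weights):
    strongest = token_names[w.index(max(w))]
    print(f"patch {patch} attends most to '{strongest}'")
```

In this toy setup the first patch leans on "cat" and the second on "rainbow", which is the sense in which different image regions pick up different parts of the prompt.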
Control Signals: Steering Wheel
ControlNet adds extra conditions like pose, depth, and edges.
When text isn't enough, image-based control signals provide precise guidance.
Click the 4 control types to see the workflow visualization.
Pose control: a human skeleton guides body position and posture.
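
One way to picture how a control signal steers generation is as an extra term added to the model's internal features. The sketch below is a heavily simplified assumption, not ControlNet's actual implementation: the `apply_control` function, the feature values, and the `strength` parameter are all hypothetical stand-ins.

```python
def apply_control(base_features, control_features, strength):
    """Add a control residual, scaled by `strength`, to the base features."""
    return [b + strength * c for b, c in zip(base_features, control_features)]

base = [0.2, -0.1, 0.4]           # hypothetical features from the text-guided model
pose_residual = [0.5, 0.0, -0.3]  # hypothetical features from a pose control branch

unchanged = apply_control(base, pose_residual, strength=0.0)
steered = apply_control(base, pose_residual, strength=1.0)

print(unchanged)  # strength 0 leaves the text-guided features untouched
print(steered)    # strength 1 shifts them toward the control signal
```

With a strength of zero the text prompt alone drives the result; raising it lets the pose, depth, or edge signal pull the features in its own direction.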


Diffusion in One Line
“From noise to art, guided by text and control.”